Preface


The  goal  of  this  book  is  to  introduce  the  reader  to  the  intellectual  beauty,  and  philosophical  implications, 
of  the  fact  that  nature  obeys  variational  principles  that  underlie  the  Lagrangian  and  Hamiltonian  analytical 
formulations  of  classical  mechanics.  These  variational  methods,  which  were  developed  for  classical  mechanics 
during the 18th and 19th centuries, have become the preeminent formalisms for classical dynamics, as well as for
many  other  branches  of  modern  science  and  engineering.  The  ambitious  goal  of  this  book  is  to  lead  the  student 
from  the  intuitive  Newtonian  vectorial  formulation,  to  introduction  of  the  more  abstract  variational  principles 
that  underlie  the  Lagrangian  and  Hamiltonian  analytical  formulations.  This  culminates  in  discussion  of  the 
contributions  of  variational  principles  to  the  development  of  relativistic  and  quantum  mechanics.  The  broad 
scope  of  this  book  attempts  to  unify  the  undergraduate  physics  curriculum  by  bridging  the  chasm  that 
divides  the  Newtonian  vector-differential  formulation  and  the  integral  variational  formulation  of  classical 
mechanics,  and  the  corresponding  chasm  that  exists  between  classical  and  quantum  mechanics.  Powerful 
variational  techniques  in  mathematics,  that  underlie  much  of  modern  physics,  are  introduced  and  problem 
solving  skills  are  developed  in  order  to  challenge  students  at  the  crucial  stage  when  they  first  encounter  this 
sophisticated  and  challenging  material.  The  underlying  fundamental  concepts  of  classical  mechanics,  and 
their  applications  to  modern  physics,  are  emphasized  throughout  the  course. 

A full understanding of the power and beauty of variational principles in classical mechanics is best
acquired by first learning the concepts of the variational approach, and then applying these concepts to
many examples in classical mechanics. Classical mechanics is the ideal topic for learning the principles and
the power of using the variational approach prior to applying these techniques to other branches of science
and engineering. The underlying philosophical approach adopted by this book was espoused by Galileo
Galilei: "You cannot teach a man anything; you can only help him find it within himself."

The development of this textbook was influenced by three textbooks: "The Variational Principles of
Mechanics" by Cornelius Lanczos (1949) [La49], "Classical Mechanics" (1950) by Herbert Goldstein [Go50],
and "Classical Dynamics of Particles and Systems" (1965) by Jerry B. Marion [Ma65]. Marion’s excellent
textbook  was  unusual  in  partially  bridging  the  chasm  between  the  outstanding  graduate  texts  by  Goldstein 
and  Lanczos,  and  a bevy  of  introductory  texts  based  on  Newtonian  mechanics  that  were  available  at  that 
time.  The  present  textbook  was  developed  to  cover  the  techniques  and  philosophical  implications  of  the 
variational  approaches  to  classical  mechanics,  with  a breadth  and  depth  close  to  that  provided  by  Goldstein 
and  Lanczos,  but  in  a format  that  better  matches  the  needs  of  the  undergraduate  student.  An  additional 
goal  is  to  bridge  the  gap  between  classical  and  modern  physics  in  the  undergraduate  curriculum. 

This book was written in support of the physics junior/senior undergraduate course P235W entitled
"Variational Principles in Classical Mechanics" that the author taught at the University of Rochester between
1993 and 2015. These lecture notes were distributed to students to allow pre-lecture study, facilitate accurate
transmission of the complicated formulae, and minimize note taking during lectures. These lecture notes
evolved into the present textbook that was used for this course. The target audience of the course, upon
which this textbook is based, typically comprised ≈ 70% junior/senior undergraduates, ≈ 25% sophomores,
< 5% graduate students, and the occasional well-prepared freshman. The target audience was physics
and  astrophysics  majors,  but  it  attracted  a significant  fraction  of  majors  from  other  disciplines  such  as 
mathematics,  chemistry,  optics,  engineering,  music,  and  the  humanities.  As  a consequence,  the  book  includes 
appreciable  introductory  level  physics,  plus  mathematical  review  material,  to  accommodate  the  diverse 
range  of  prior  preparation  of  the  students.  This  textbook  includes  material  that  extends  beyond  what 
reasonably  can  be  covered  during  a one-term  course.  This  supplemental  material  is  presented  to  show  the 
importance  and  broad  applicability  of  variational  concepts  to  classical  mechanics.  The  book  includes  162 
worked examples to illustrate the concepts presented. Advanced group-theoretic concepts are minimized to
better accommodate the mathematical skills of the typical undergraduate physics major. For compatibility
with  modern  literature  in  this  field,  this  book  follows  the  widely-adopted  nomenclature  used  in  "Classical 
Mechanics"  by  Goldstein[Go50],  with  recent  additions  by  Johns[Jo05]. 

The  book  is  broken  into  four  major  sections.  This  first  review  section  sets  the  stage  by  including  a 
brief  historical  introduction  (chapter  1),  review  of  the  Newtonian  formulation  of  mechanics  plus  gravitation 
(chapter  2),  linear  oscillators  and  wave  motion  (chapter  3),  and  an  introduction  to  non-linear  dynamics 
and  chaos  (chapter  4).  Extensive  reading  assignments  are  assigned  to  minimize  the  time  spent  on  this 
review  of  Newtonian  vectorial  mechanics.  Building  on  the  introductory  section,  the  second  section  of  the 
book  introduces  the  variational  principles  of  analytical  mechanics  that  underlie  this  book.  It  includes  an 
introduction to the calculus of variations (chapter 5), the Lagrangian formulation of mechanics with
applications to holonomic and non-holonomic systems (chapter 6), a discussion of symmetries, invariance, plus
Noether’s theorem (chapter 7), and an introduction to the Hamiltonian and the Hamiltonian formulation
of mechanics plus the Routhian reduction technique (chapter 8). The third section of the book applies
Lagrangian and Hamiltonian formulations of classical dynamics to central force problems (chapter 9),
motion in non-inertial frames (chapter 10), rigid-body rotation (chapter 11), and coupled oscillators (chapter
12). The final section of the book discusses Hamilton’s Principle plus advanced applications of Lagrangian
mechanics (chapter 13), Hamiltonian mechanics including Poisson brackets, Liouville’s theorem, canonical
transformations, Hamilton-Jacobi theory, the action-angle technique (chapter 14), and classical mechanics
in the continua (chapter 15). This is followed by a brief review of the revolution in classical mechanics
introduced by Einstein’s theory of relativistic mechanics. The extended theory of Lagrangian and Hamiltonian
mechanics  is  used  to  apply  variational  techniques  to  the  Special  Theory  of  Relativity  followed  by  a superficial 
introduction  to  the  concepts  of  General  Theory  of  Relativity  (chapter  16).  The  book  finishes  with  a brief 
review  of  the  role  of  variational  principles  in  bridging  the  gap  between  classical  mechanics  and  quantum 
mechanics (chapter 17). These advanced topics extend beyond the typical syllabus for an undergraduate
classical  mechanics  course.  The  reason  for  introducing  these  advanced  topics  is  to  stimulate  student  interest 
in  physics  by  giving  them  a glimpse  of  the  physics  at  the  summit  that  they  have  struggled  to  climb.  This 
glimpse  illustrates  the  breadth  of  classical  mechanics,  and  the  role  that  variational  principles  have  played 
in  the  development  of  classical,  relativistic,  quantal,  and  statistical  mechanics.  These  final  supplemental 
lectures  illustrate  the  beauty  and  unity  of  classical  mechanics,  and  the  foundation  that  classical  mechanics 
has  provided  to  the  development  of  modern  physics.  The  appendices  summarize  aspects  of  the  mathematical 
methods  that  are  exploited  in  classical  mechanics. 

The  present  textbook  contains  more  material  than  required  for  a junior/senior  undergraduate  classical 
mechanics  course,  and  thus,  it  could  serve  as  the  text  for  a graduate  course  by  focussing  the  course  on  the 
variational principles covered by chapters 5 through 17. The partitioning and ordering of the topics in the book
are  the  result  of  many  permutations  tried  while  teaching  classical  mechanics  for  many  years.  Chapters  1 
through  3,  plus  the  mathematical  appendices,  are  used  as  reading  assignments  during  the  first  three  weeks 
of  class  to  minimize  the  time  spent  reviewing  Newtonian  mechanics.  This  maximizes  the  class  time  available 
to  cover  the  variational  approach,  that  is,  chapters  5 through  14.  The  brief  reviews  of  the  mechanics  in  the 
continua,  and  the  transition  to  quantum  mechanics,  provide  the  student  with  a glimpse  of  the  implications 
of  analytical  mechanics  to  these  more  advanced  topics. 

Information regarding the associated P235 undergraduate course at the University of Rochester is
available on the web site at http://www.pas.rochester.edu/~cline/P235/index.shtml. Information about the
author is available at the Cline home web site: http://www.pas.rochester.edu/~cline/index.html.

The  author  thanks  Meghan  Sarkis  who  prepared  many  of  the  illustrations,  Joe  Easterly  who  designed  the 
book  cover  plus  the  webpage,  and  Moriana  Garcia  who  organized  publication.  Andrew  Sifain  developed  the 
diagnostic  workshop  questions.  The  author  appreciates  the  permission,  granted  by  Professor  Struckmeier,  to 
quote  his  published  article  on  the  extended  Hamilton-Lagrangian  formalism.  The  author  acknowledges  the 
feedback  and  suggestions  made  by  many  students  who  have  taken  this  course,  as  well  as  helpful  suggestions 
by his colleagues: Andrew Abrams, Adam Hayes, Connie Jones, Andrew Melchionna, David Munson, Alice
Quillen, Richard Sarkis, James Schneeloch, Steven Torrisi, Dan Watson, and Frank Wolfs. These lecture
notes were typed in LaTeX using Scientific Workplace (MacKichan Software, Inc.), while Adobe Illustrator,
Photoshop, Origin, Mathematica, and MuPAD were used to prepare the illustrations.

Douglas  Cline, 

University  of  Rochester,  2017 
Two dramatically different philosophical approaches to science were developed in the field of classical
mechanics during the 17th and 18th centuries. This time period coincided with the Age of Enlightenment in Europe
during  which  remarkable  intellectual  and  philosophical  developments  occurred.  This  was  a time  when  both 
philosophical  and  causal  arguments  were  equally  acceptable  in  science,  in  contrast  with  current  convention 
where  there  appears  to  be  tacit  agreement  to  discourage  use  of  philosophical  arguments  in  science. 


Snell’s Law: The genesis of two contrasting philosophical approaches
to science relates back to early studies of the reflection
and refraction of light. The velocity of light in a medium of refractive
index $n$ equals $v = \frac{c}{n}$, where $c$ is the velocity of light in vacuum. Thus a light beam incident at an
angle $\theta_1$ to the normal of a plane interface between medium 1
and medium 2 is refracted at an angle $\theta_2$ in medium 2, where the
angles are related by Snell’s Law.

$$\frac{\sin\theta_1}{\sin\theta_2} = \frac{v_1}{v_2} = \frac{n_2}{n_1} \qquad \text{(Snell’s Law)}$$


Ibn Sahl of Baghdad (984) first described the refraction of light,
while Snell (1621) derived his law mathematically. Both of these
scientists used the "vectorial approach" where the light velocity $v$
is considered to be a vector pointing in the direction of propagation.


Fermat’s Principle: Fermat’s principle of least time (1657),
which is based on the work of Hero of Alexandria (~60) and Ibn
al-Haytham (1021), states that "light travels between two given
points along the path of shortest time", where the transit time $\tau$
of a light beam between two locations $A$ and $B$, in a medium with
position-dependent refractive index $n(s)$, is given by

$$\tau = \int_{t_A}^{t_B} dt = \frac{1}{c}\int_{A}^{B} n(s)\,ds \qquad \text{(Fermat’s Principle)}$$

Fermat’s Principle leads to the derivation of Snell’s Law.
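
The link between least transit time and Snell’s Law can be checked numerically. The following is a minimal sketch, not part of the original derivation: it varies the point where a two-segment ray crosses a plane interface and keeps the crossing that minimizes the transit time. The geometry (A one unit above the interface, B one unit below it, unit horizontal separation), the refractive indices, the choice of units with c = 1, and all function names are assumptions made only for this illustration.

```python
import math

# Assumed illustrative setup: interface along y = 0, units chosen so that c = 1.
N1, N2 = 1.0, 1.5      # assumed refractive indices of medium 1 and medium 2
A_HEIGHT = 1.0         # A = (0, A_HEIGHT) lies in medium 1
B_DEPTH = 1.0          # B = (SPAN, -B_DEPTH) lies in medium 2
SPAN = 1.0             # horizontal separation between A and B
C = 1.0                # velocity of light in vacuum (assumed units)

def transit_time(x):
    """Transit time for the path A -> (x, 0) -> B, i.e. (1/c) * sum of n * segment length."""
    return (N1 * math.hypot(x, A_HEIGHT) + N2 * math.hypot(SPAN - x, B_DEPTH)) / C

def fastest_crossing(lo=0.0, hi=SPAN, tol=1e-10):
    """Ternary search for the crossing point that minimizes the convex transit time."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if transit_time(m1) < transit_time(m2):
            hi = m2    # the minimum cannot lie to the right of m2
        else:
            lo = m1    # the minimum cannot lie to the left of m1
    return 0.5 * (lo + hi)

x_min = fastest_crossing()
# Sines of the angles measured from the normal to the interface.
sin_theta1 = x_min / math.hypot(x_min, A_HEIGHT)
sin_theta2 = (SPAN - x_min) / math.hypot(SPAN - x_min, B_DEPTH)
print(sin_theta1 / sin_theta2, N2 / N1)
```

Under these assumptions the two printed ratios should agree to within the tolerance of the search, which is the content of Snell’s Law for the path of least time.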

Philosophically the physics underlying the contrasting vectorial
and Fermat’s Principle derivations of Snell’s Law is dramatically
different. The vectorial approach is based on differential relations
between the velocity vectors in the two media, whereas Fermat’s
variational approach is based on the fact that the light preferentially
selects a path for which the integral of the transit time
between the initial location $A$ and the final location $B$ is minimized.
That is, the first approach is based on "vectorial mechanics" whereas Fermat’s approach is based on
variational principles in that the path between the initial and final locations is varied to find the path that
minimizes the transit time. Fermat’s enunciation of variational principles in physics played a key role in the
historical development, and subsequent exploitation, of the principle of least action in analytical formulations
of classical mechanics as discussed below.

Figure 1: Vectorial and variational representation of Snell’s Law for refraction of light.


PROLOGUE 


Newtonian  mechanics:  Momentum  and  force  are  vectors  that  underlie  the  Newtonian  formulation  of 
classical mechanics. Newton’s monumental treatise, entitled "Philosophiae Naturalis Principia Mathematica",
published in 1687, established his three universal laws of motion, the universal theory of gravitation,
the  derivation  of  Kepler’s  three  laws  of  planetary  motion,  and  the  development  of  calculus.  Newton’s  three 
universal  laws  of  motion  provide  the  most  intuitive  approach  to  classical  mechanics  in  that  they  are  based  on 
vector  quantities  like  momentum,  and  the  rate  of  change  of  momentum,  which  are  related  to  force.  Newton’s 
equation  of  motion 


$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \qquad \text{(Newton’s equation of motion)}$$


is a vector differential relation between the instantaneous forces and rate of change of momentum, or equivalent
instantaneous accelerations, all of which are vector quantities. Momentum and force are easy to visualize,
and both cause and effect are embedded in Newtonian mechanics. Thus, if all of the forces, including the
constraint forces, acting on the system are known, then the motion is solvable for two-body systems. The
mathematics  for  handling  Newton’s  "vectorial  mechanics"  approach  to  classical  mechanics  is  well  established. 


Analytical  mechanics:  Variational  principles  underlie  the  analytical  formulation  of  mechanics.  Leibniz, 
who  was  a contemporary  of  Newton,  introduced  methods  based  on  a quantity  called  "vis  viva",  which  is 
Latin for "living force" and equals twice the kinetic energy. Leibniz believed in the philosophy that God
created  a perfect  world  where  nature  would  be  thrifty  in  all  its  manifestations.  In  1707,  Leibniz  proposed 
that  the  optimum  path  is  based  on  minimizing  the  time  integral  of  the  vis  viva,  which  is  equivalent  to 
the  action  integral  of  Lagrangian/Hamiltonian  mechanics.  In  1744  Euler  derived  the  Leibniz  result  using 
variational  concepts  while  Maupertuis  restated  the  Leibniz  result  based  on  teleological  arguments.  The 
development  of  Lagrangian  mechanics  culminated  in  the  1788  publication  of  Lagrange’s  monumental  treatise 
entitled "Mécanique Analytique". Lagrangian mechanics derives the magnitude and direction of the optimum
trajectories  and  forces  based  on  the  concept  of  least  action,  which  is  defined  to  be  the  time  integral  of  the 
difference  between  the  kinetic  and  potential  energies.  Hamilton’s  Principle  (1834),  which  underlies  Lagrange’s 
least  action  principle,  minimizes  the  action  integral  S,  given  by 

$$S = \int_{t_A}^{t_B} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt \qquad \text{(Hamilton’s Principle)}$$

where the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ equals the difference between the kinetic energy $T$ and the potential energy
$U$. This Lagrangian is a function of $n$ generalized coordinates $q_i$ plus their corresponding velocities $\dot{q}_i$.
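
The meaning of minimizing the action can be illustrated with a small numerical experiment. The sketch below is an assumption-laden example rather than a general method: it takes a particle of unit mass in uniform gravity, fixes the endpoints at x(0) = x(T) = 0, and compares the discretized action of the Newtonian trajectory with that of trial paths obtained by adding a sinusoidal bump of amplitude a that vanishes at both endpoints. The function names, the number of time steps, and the choice of bump are all hypothetical.

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretized action: sum over segments of (T - U) * dt, with U = m*g*x."""
    s = 0.0
    for x0, x1 in zip(path, path[1:]):
        v = (x1 - x0) / dt                     # velocity on this segment
        kinetic = 0.5 * m * v * v
        potential = m * g * 0.5 * (x0 + x1)    # potential at the segment midpoint
        s += (kinetic - potential) * dt
    return s

def trial_path(a, n=200, T=1.0, g=9.81):
    """Newtonian path with x(0) = x(T) = 0, plus a bump of amplitude a (assumed form)."""
    dt = T / n
    path = []
    for k in range(n + 1):
        t = k * dt
        newton = 0.5 * g * t * (T - t)         # satisfies x'' = -g with fixed endpoints
        path.append(newton + a * math.sin(math.pi * t / T))  # bump is zero at t = 0 and t = T
    return path, dt

def action_of_amplitude(a):
    """Action of the trial path labelled by the bump amplitude a."""
    path, dt = trial_path(a)
    return action(path, dt)

for a in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(a, action_of_amplitude(a))
```

Within this family of trial paths the smallest action should occur at a = 0, the Newtonian trajectory, and the action should grow quadratically with the bump amplitude on either side.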

The culmination of the development of analytical mechanics occurred in 1834 when Hamilton developed
the premier variational approach, called Hamiltonian mechanics, that is based on the Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$
which is a function of the $n$ fundamental conjugate position $q_i$ plus the momentum $p_i$ variables. In 1843
Jacobi provided the mathematical framework required to fully exploit the power of Hamiltonian mechanics.
Note that the Lagrangian, Hamiltonian, and the action integral all are scalar quantities, which simplifies
derivation of the equations of motion compared with the vector calculus used by Newtonian mechanics.


Philosophical developments: Variational principles apply to all aspects of our daily life. Typical
examples include: selecting the optimum compromise in quality and cost when shopping, selecting the fastest
route  to  travel  from  home  to  work,  or  selecting  the  optimum  compromise  to  satisfy  the  disparate  desires  of 
the  individuals  comprising  a family.  It  is  astonishing  that  the  laws  of  nature  are  consistent  with  variational 
principles  involving  the  principle  of  least  action.  Minimizing  the  action  integral  led  to  the  development  of  the 
mathematical  field  of  variational  calculus  plus  the  analytical  variational  approaches  to  classical  mechanics 
by  Euler,  Lagrange,  Hamilton,  and  Jacobi. 

The analytical approach to classical mechanics appeared contradictory to Newton’s intuitive vectorial
treatment of force and momentum. There is a dramatic difference in philosophy between the vector-differential
equations of motion derived by Newtonian mechanics, which relate the instantaneous force to
the corresponding instantaneous acceleration, and analytical mechanics, where minimizing the scalar action
integral involves integrals over space and time between specified initial and final states. Analytical mechanics
uses variational principles to determine the optimum trajectory from a continuum of tentative possibilities
by requiring that the optimum trajectory minimizes the action integral between specified initial and final
conditions.

Figure 2: Chronological roadmap of the parallel development of the Newtonian and the variational approaches
to classical mechanics.


Initially  there  was  considerable  prejudice  and  philosophical  opposition  to  use  of  the  variational  approach 
which  is  based  on  the  assumption  that  nature  follows  the  principles  of  economy.  The  variational  approach 
is  not  intuitive,  and  thus  it  was  considered  to  be  speculative  and  "metaphysical",  but  it  was  tolerated  as  an 
efficient  tool  for  exploiting  classical  mechanics.  This  opposition  to  the  variational  principles,  that  underlie 
analytical  mechanics,  delayed  full  appreciation  of  the  variational  approach  until  the  start  of  the  20th  century. 
As  a consequence,  the  intuitive  Newtonian  formulation  reigned  supreme  in  classical  mechanics  for  over  two 
centuries,  even  though  the  remarkable  problem-solving  capabilities  of  analytical  mechanics  were  recognized 
and  exploited  following  development  of  analytical  mechanics. 

The  full  significance  and  superiority  of  the  analytical  variational  formulations  of  classical  mechanics 
became  widely  accepted  following  the  development  of  the  Special  Theory  of  Relativity  in  1905.  The  Theory 
of  Relativity  requires  that  the  laws  of  nature  be  invariant  to  the  reference  frame.  This  is  not  satisfied  by 
the  Newtonian  formulation  of  mechanics  which  assumes  one  absolute  frame  of  reference  and  a separation  of 
space  and  time.  In  contrast,  the  Lagrangian  and  Hamiltonian  formulations  of  the  principle  of  least  action 
remain  valid  in  the  Theory  of  Relativity,  if  the  Lagrangian  is  written  in  a relativistically-invariant  form 
in  space-time.  The  complete  invariance  of  the  variational  approach  to  coordinate  frames  is  precisely  the 
formalism  necessary  for  handling  relativistic  mechanics.  Hamiltonian  mechanics,  which  is  expressed  in  terms 
of  the  conjugate  variables  (q,  p),  relates  classical  mechanics  directly  to  the  underlying  physics  of  quantum 
mechanics  and  quantum  field  theory.  As  a consequence,  the  philosophical  opposition  to  exploiting  variational 
principles  no  longer  exists,  and  Hamiltonian  mechanics  has  become  the  preeminent  formulation  of  modern 
classical  mechanics.  The  reader  is  free  to  draw  their  own  conclusions  regarding  the  philosophical  question 
"is  the  principle  of  economy  a fundamental  law  of  classical  mechanics,  or  is  it  a fortuitous  consequence  of 
the  fundamental  laws  of  nature?" 

From the late seventeenth century, until the dawn of modern physics at the start of the twentieth
century, classical mechanics remained a primary driving force in the development of physics. Classical mechanics
embraces an unusually broad range of topics spanning motion of macroscopic astronomical bodies to
microscopic particles in nuclear and particle physics, at velocities ranging from zero to near the velocity of
light, from one-body to statistical many-body systems, as well as having extensions to quantum mechanics.
Introduction  of  the  Special  Theory  of  Relativity  in  1905,  and  the  General  Theory  of  Relativity  in  1916, 
necessitated  modifications  to  classical  mechanics  for  relativistic  velocities,  and  can  be  considered  to  be  an 
extended  theory  of  classical  mechanics.  Since  the  1920's,  quantal  physics  has  superseded  classical  mechanics 
in  the  microscopic  domain.  Although  quantum  physics  has  played  the  leading  role  in  the  development  of 
physics  during  much  of  the  past  century,  classical  mechanics  still  is  a vibrant  field  of  physics  that  recently 
has  led  to  exciting  developments  associated  with  non-linear  systems  and  chaos  theory.  This  has  spawned 
new  branches  of  physics  and  mathematics  as  well  as  changing  our  notion  of  causality. 

Goals:  The  primary  goal  of  this  book  is  to  introduce  the  reader  to  the  powerful  variational  approaches  that 
play  such  a pivotal  role  in  classical  mechanics,  plus  many  other  branches  of  modern  science  and  engineering. 
Figure 2 gives a historical roadmap of the evolution of classical mechanics from Newton, to the variational
approaches  of  Euler,  Lagrange,  Hamilton  and  Jacobi.  This  book  emphasizes  the  intellectual  beauty  of  these 
remarkable  developments  as  well  as  stressing  the  philosophical  implications  that  have  had  a tremendous 
impact  on  modern  science.  A secondary  goal  is  to  apply  variational  principles  to  solve  advanced  applications 
in  classical  mechanics  in  order  to  introduce  many  sophisticated  and  powerful  mathematical  techniques  that 
underlie  much  of  modern  physics. 

The  connections  and  applications  of  classical  mechanics  to  modern  physics  are  emphasized  throughout 
the  book  in  an  effort  to  span  the  chasm  that  divides  the  Newtonian  vector-differential  formulation  and  the 
integral  variational  formulation  of  classical  mechanics,  and  the  corresponding  chasm  that  exists  between 
classical  and  quantum  mechanics.  Note  that  these  variational  principles,  developed  in  the  field  of  classical 
mechanics,  now  are  used  in  a diverse  and  wide  range  of  fields  including  economics,  meteorology,  engineering, 
and  computing. 

This  study  of  classical  mechanics  involves  climbing  a vast  mountain  of  knowledge,  and  the  pathway  to 
the  top  leads  to  elegant  and  beautiful  theories  that  underlie  much  of  modern  physics.  These  theories  are 
applied  to  four  major  topics  in  classical  mechanics.  In  addition,  being  so  close  to  the  summit  provides  the 
opportunity  for  this  book  to  take  a few  extra  steps  beyond  the  normal  undergraduate  classical  mechanics 
syllabus  to  provide  a glimpse  of  the  exciting  physics  found  at  the  summit.  This  new  physics  includes  topics 
such as quantum, relativistic, and statistical mechanics.


Chapter  1 


A brief  history  of  classical  mechanics 


1.1  Introduction 

This  chapter  briefly  reviews  the  historical  evolution  of  classical  mechanics  since  considerable  insight  can  be 
gained  from  study  of  the  history  of  science.  There  are  two  dramatically  different  approaches  used  in  classical 
mechanics.  The  first  is  the  vectorial  approach  of  Newton  which  is  based  on  vector  quantities  like  momentum, 
force,  and  acceleration.  The  second  is  the  analytical  approach  of  Lagrange,  Euler,  Hamilton,  and  Jacobi, 
that  is  based  on  the  concept  of  least  action  and  variational  calculus.  The  more  intuitive  Newtonian  picture 
reigned  supreme  in  classical  mechanics  until  the  start  of  the  twentieth  century.  Variational  principles,  which 
were  developed  during  the  nineteenth  century,  never  aroused  much  enthusiasm  in  scientific  circles  due  to 
philosophical  objections  to  the  underlying  concepts;  this  approach  was  merely  tolerated  as  an  efficient  tool 
for  exploiting  classical  mechanics.  A dramatic  advance  in  the  philosophy  of  scientific  thinking  occurred  at 
the  start  of  the  20th  century  leading  to  widespread  acceptance  of  the  superiority  of  variational  principles. 


1.2  Prehistoric  astronomy 

Astronomy is the earliest branch of classical mechanics. Astronomical observatories date back to around
4900 BC when wooden solar observatories, called henges, were built in Europe. Stonehenge in England
is a well-known example which was built ~3000 BC. The Mesopotamian people, who lived in the land
between the Tigris and Euphrates rivers, developed cuneiform writing and recorded accurate numerical
data around 3500 - 3000 BC. They recognized that the motion of the planets was periodic as reported in
Babylonian tablets. After 2700 BC the Egyptians built pyramids that are aligned to the pole star and they
made significant advances in astronomy, mathematics and medicine.


1.3  Greek  antiquity 

The great philosophers in ancient Greece played a key role by using the astronomical work of the Babylonians
to develop scientific theories of mechanics. Thales of Miletus (624 - 547 BC), the first of the seven
great Greek philosophers, developed geometry and is hailed as the first true mathematician. Pythagoras
(570 - 495 BC) developed mathematics and postulated that the earth is spherical. Democritus (460 -
370 BC) has been called the father of modern science, while Socrates (469 - 399 BC) is renowned for his
contributions to ethics. Plato (427-347 B.C.), who was a mathematician and student of Socrates, wrote
important philosophical dialogues. He founded the Academy in Athens, which was the first institution of
higher learning in the Western world and helped lay the foundations of Western philosophy and science.
Aristotle (384-322 B.C.) is an important founder of Western philosophy encompassing ethics, logic,
science, and politics. His views on the physical sciences profoundly influenced medieval scholarship that
extended well into the Renaissance. He presented the first implied formulation of the principle of virtual
work in statics and his statement that "what is lost in velocity is gained in force" is a veiled reference to
kinetic and potential energy. He adopted an Earth-centered model of the universe. Aristarchus (310 - 240
B.C.) argued that the Earth orbited the Sun and used measurements to infer the relative distances of the
Moon and the Sun. The Greek philosophers were relatively advanced in logic and mathematics and developed
concepts that enabled them to calculate areas and perimeters. Unfortunately their philosophical approach
neglected collecting quantitative and systematic data that is an essential ingredient to the advancement of
science.

Archimedes  (287-212  B.C.)  represented  the  culmination  of  science  in  ancient  Greece.  As  an  engineer 
he  designed  machines  of  war  while  as  a scientist  he  made  significant  contributions  to  hydrostatics  and  the 
principle of the lever. As a mathematician he applied infinitesimals in a way that is reminiscent of modern
integral calculus which he used to derive a value for $\pi$. Unfortunately much of the work of the brilliant
Archimedes subsequently fell into oblivion. Hero of Alexandria (10 - 70 A.D.) described the principle
of reflection that light takes the shortest path. This is an early illustration of the variational principle of
least  time.  Ptolemy  (83  - 161  A.D.)  wrote  several  scientific  treatises  that  greatly  influenced  subsequent 
philosophers.  Unfortunately  he  adopted  the  incorrect  geocentric  solar  system  in  contrast  to  the  heliocentric 
model  of  Aristarchus  and  others. 


1.4  Middle  Ages 

The  decline  and  fall  of  the  Roman  Empire  in  ~410  A.D.  marks  the  end  of  Classical  Antiquity  and  the 
beginning  of  the  Dark  Ages  in  Western  Europe  (Christendom)  while  the  Muslim  scholars  in  Eastern  Europe 
continued  to  make  progress  in  astronomy  and  mathematics.  For  example,  in  Egypt,  Alhazen  (965  - 1040 
A.D.)  expanded  the  principle  of  least  time  to  reflection  and  refraction.  The  Dark  Ages  involved  a long 
scientific  decline  in  Western  Europe  that  languished  for  about  900  years.  Science  was  dominated  by  religious 
dogma,  all  western  scholars  were  monks,  and  the  important  scientific  achievements  of  Greek  antiquity  were 
forgotten. The works of Aristotle were reintroduced to Western Europe by Arabs in the early 13th century,
leading to the concepts of forces in static systems, which were developed during the fourteenth century.
This included concepts of the work done by a force, and the virtual work involved in virtual displacements.
Leonardo  da  Vinci  (1452-1519)  was  a leader  in  mechanics  at  that  time.  He  made  seminal  contributions 
to  science,  in  addition  to  his  well  known  contributions  to  architecture,  engineering,  sculpture,  and  art. 

Nicolaus Copernicus (1473-1543) rejected the geocentric theory of Ptolemy and formulated a scientifically
based heliocentric cosmology that displaced the Earth from the center of the universe. The Ptolemaic view
was that heaven represented the perfect unchanging divine while the earth represented change plus chaos
and the celestial bodies moved relative to the fixed heavens. The book "De revolutionibus orbium coelestium"
(On the Revolutions of the Celestial Spheres), published by Copernicus in 1543, is regarded as the starting
point of modern astronomy and the defining epiphany that began the Scientific Revolution. The book "De
Magnete", written in 1600 by the English physician William Gilbert (1540-1603), presented the results of
well-planned  studies  of  magnetism  and  strongly  influenced  the  intellectual-scientific  evolution  at  that  time. 

Johannes  Kepler  (1571-1630),  a German  mathematician,  astronomer  and  astrologer,  was  a key 
figure  in  the  17th  century  Scientific  Revolution.  He  is  best  known  for  recognizing  the  connection  between  the 
motions  in  the  sky  and  physics.  His  laws  of  planetary  motion  were  developed  by  later  astronomers  based  on 
his written work "Astronomia nova", "Harmonices Mundi", and "Epitome of Copernican Astronomy".
Kepler  was  an  assistant  to  Tycho  Brahe  (1546-1601)  who  for  many  years  recorded  accurate  astronomical 
data  that  played  a key  role  in  the  development  of  Kepler’s  theory  of  planetary  motion.  Kepler’s  work 
provided  the  foundation  for  Isaac  Newton’s  theory  of  universal  gravitation.  Unfortunately  Kepler  did  not 
recognize  the  true  nature  of  the  gravitational  force. 

Galileo Galilei (1564-1642) built on Aristotle’s principle by recognizing the law of inertia, the
persistence  of  motion  if  no  forces  act,  and  the  proportionality  between  force  and  acceleration.  This  amounts 
to  recognition  of  work  as  the  product  of  force  times  displacement  in  the  direction  of  the  force.  He  applied 
virtual  work  to  the  equilibrium  of  a body  on  an  inclined  plane.  He  also  showed  that  the  same  principle 
applies  to  hydrostatic  pressure  that  had  been  established  by  Archimedes,  but  he  did  not  apply  his  concepts 
in  classical  mechanics  to  the  considerable  knowledge  base  on  planetary  motion.  Galileo  is  famous  for  the 
apocryphal  story  that  he  dropped  two  cannon  balls  of  different  masses  from  the  Tower  of  Pisa  to  demonstrate 
that their speed of descent was independent of their mass.


1.5 Age of Enlightenment

The  Age  of  Enlightenment  is  a term  used  to  describe  a phase  in  Western  philosophy  and  cultural  life  in 
which  reason  was  advocated  as  the  primary  source  and  legitimacy  for  authority.  It  developed  simultaneously 
in Germany, France, Britain, the Netherlands, and Italy around the 1650s and lasted until the French
Revolution  in  1789.  The  intellectual  and  philosophical  developments  led  to  moral,  social,  and  political 
reforms.  The  principles  of  individual  rights,  reason,  common  sense,  and  deism  were  a revolutionary  departure 
from  the  existing  theocracy,  autocracy,  oligarchy,  aristocracy,  and  the  divine  right  of  kings.  It  led  to  political 
revolutions  in  France  and  the  United  States.  It  marks  a dramatic  departure  from  the  Early  Modern  period 
which  was  noted  for  religious  authority,  absolute  state  power,  guild-based  economic  systems,  and  censorship  of 
ideas.  It  opened  a new  era  of  rational  discourse,  liberalism,  freedom  of  expression,  and  scientific  method.  This 
new  environment  led  to  tremendous  advances  in  both  science  and  mathematics  in  addition  to  music  (Johann 
Sebastian  Bach,  Mozart),  literature  (Goethe),  philosophy  (Spinoza,  Kant)  and  art  (Rubens).  Scientific 
development during the 17th century included the pivotal advances made by Newton and Leibniz at the
beginning of the revolutionary Age of Enlightenment, culminating in the development of variational calculus
and analytical mechanics by Euler and Lagrange. The scientific advances of this age include publication of
two monumental books, "Philosophiae Naturalis Principia Mathematica" by Newton in 1687 and "Mécanique
analytique" by Lagrange in 1788. These are the two definitive books upon which classical mechanics is built.

René Descartes (1596-1650) attempted to formulate the laws of motion in 1644. He talked about
conservation of motion (momentum) in a straight line but did not recognize the vector character of momentum.
Pierre de Fermat (1601-1665) and René Descartes were two leading mathematicians in the first
half of the 17th century. Independently they discovered the principles of analytic geometry and developed
some initial concepts of calculus. Fermat and Blaise Pascal (1623-1662) were the founders of the theory
of probability. Fermat revived the principle of least time, which states that "light travels between two given
points along the path of shortest time" and was used to derive Snell’s law in 1657. This enunciation of variational
principles in physics played a key role in the historical development of the principle of least action
that  underlies  the  analytical  formulations  of  classical  mechanics. 
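
As a brief illustration of how the principle of least time leads to Snell’s law, consider the following sketch. The geometry and the symbols $a$, $b$, $d$, $x$, $v_1$, and $v_2$ are assumptions introduced here for illustration only. Suppose light travels from a point at height $a$ above a flat interface to a point at depth $b$ below it, the two points being separated horizontally by $d$. Let the ray cross the interface at horizontal distance $x$ from the first point, and let the speeds of light in the two media be $v_1$ and $v_2$. The travel time is then

$$ t(x) = \frac{\sqrt{a^2 + x^2}}{v_1} + \frac{\sqrt{b^2 + (d - x)^2}}{v_2} $$

Requiring the travel time to be stationary, $dt/dx = 0$, gives

$$ \frac{x}{v_1 \sqrt{a^2 + x^2}} - \frac{d - x}{v_2 \sqrt{b^2 + (d - x)^2}} = 0 \quad \Longrightarrow \quad \frac{\sin\theta_1}{v_1} = \frac{\sin\theta_2}{v_2} $$

where $\theta_1$ and $\theta_2$ are the angles that the ray makes with the normal to the interface in each medium. Writing the refractive index as $n = c/v$, this is the familiar form $n_1 \sin\theta_1 = n_2 \sin\theta_2$.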

Isaac  Newton  (1642-1727)  made  pioneering  contributions  to  physics  and  mathematics  as  well  as 
being  a theologian.  At  18  he  was  admitted  to  Trinity  College  Cambridge  where  he  read  the  writings  of 
modern  philosophers  like  Descartes,  and  astronomers  like  Copernicus,  Galileo,  and  Kepler.  By  1665  he 
had discovered the generalized binomial theorem, and began developing infinitesimal calculus. Due to a
plague,  the  university  closed  for  two  years  in  1665  during  which  Newton  worked  at  home  developing  the 
theory  of  calculus  that  built  upon  the  earlier  work  of  Barrow  and  Descartes.  He  was  elected  Lucasian 
Professor of Mathematics in 1669 at the age of 26. From 1670 Newton focussed on optics leading to his
"Hypothesis of Light" published in 1675 and his book "Opticks" in 1704. Newton described light as being
made  up  of  a flow  of  extremely  subtle  corpuscles  that  also  had  associated  wavelike  properties  to  explain 
diffraction  and  optical  interference  that  he  studied.  Newton  returned  to  mechanics  in  1677  by  studying 
planetary  motion  and  gravitation  that  applied  the  calculus  he  had  developed.  In  1687  he  published  his 
monumental  treatise  entitled  "Philosophiae  Naturalis  Principia  Mathematica"  which  established  his  three 
universal  laws  of  motion,  the  universal  theory  of  gravitation,  derivation  of  Kepler’s  three  laws  of  planetary 
motion,  and  was  his  first  publication  of  the  development  of  calculus  which  he  called  "the  science  of  fluxions". 
Newton’s  laws  of  motion  are  based  on  the  concepts  of  force  and  momentum,  that  is,  force  equals  the  rate  of 
change  of  momentum.  Newton’s  postulate  of  an  invisible  force  able  to  act  over  vast  distances  led  him  to  be 
criticized  for  introducing  "occult  agencies"  into  science.  In  a remarkable  achievement,  Newton  completely 
solved  the  laws  of  mechanics.  His  theory  of  classical  mechanics  and  of  gravitation  reigned  supreme  until  the 
development  of  the  Theory  of  Relativity  in  1905.  The  followers  of  Newton  envisioned  the  Newtonian  laws 
to  be  absolute  and  universal.  This  dogmatic  reverence  of  Newtonian  mechanics  prevented  physicists  from 
an  unprejudiced  appreciation  of  the  analytic  variational  approach  to  mechanics  developed  during  the  17th 
through  19th  centuries.  Newton  was  the  first  scientist  to  be  knighted  and  was  appointed  president  of  the 
Royal  Society.  Newton  had  an  unpleasant  character  and  was  notorious  for  the  heated  disputes  he  provoked 
with  other  academics.  Eventually  he  left  academia  and  became  active  in  politics.  This  led  to  his  appointment 
as  Warden  of  the  Royal  Mint  where  he  conducted  a major  campaign  against  counterfeiting  that  sent  several 
men to their death on the gallows.

Gottfried Leibniz (1646-1716) was a brilliant German philosopher, a contemporary of Newton, who
worked  on  both  calculus  and  mechanics.  Leibniz  started  development  of  calculus  in  1675,  ten  years  after 
Newton,  but  Leibniz  published  his  work  in  1684,  which  was  three  years  before  Newton’s  Principia.  Leibniz 
made  significant  contributions  to  integral  calculus  and  was  responsible  for  the  calculus  notation  currently 
used.  He  introduced  the  name  calculus  based  on  the  Latin  word  for  the  small  stone  used  for  counting. 
Newton  and  Leibniz  were  involved  in  a protracted  argument  over  who  originated  calculus.  It  appears  that 
Leibniz  saw  drafts  of  Newton’s  work  on  calculus  during  a visit  to  England.  Throughout  their  argument 
Newton  was  the  ghost  writer  of  most  of  the  articles  in  support  of  himself  and  he  had  them  published  under 
the noms de plume of his friends. Leibniz made the tactical error of appealing to the Royal Society to intercede on
his behalf. Newton, as president of the Royal Society, appointed his friends to an "impartial" committee to
investigate  this  issue,  then  he  wrote  the  committee’s  report  that  accused  Leibniz  of  plagiarism  of  Newton’s 
work  on  calculus,  after  which  he  had  it  published  by  the  Royal  Society.  Still  unsatisfied  he  then  wrote  an 
anonymous  review  of  the  report  in  the  Royal  Society’s  own  periodical.  This  bitter  dispute  lasted  until  the 
death  of  Leibniz.  When  Leibniz  died  his  work  was  largely  discredited.  The  fact  that  he  falsely  claimed  to  be 
a nobleman  and  added  the  prefix  von  to  his  name,  coupled  with  Newton’s  vitriolic  attacks,  did  not  help  his 
credibility.  Newton  is  reported  to  have  declared  that  he  took  great  satisfaction  in  "breaking  Leibniz’s  heart." 
Studies  during  the  20th  century  have  largely  revived  the  reputation  of  Leibniz  and  he  is  acknowledged  to 
have  made  major  contributions  to  the  development  of  calculus. 

Leibniz  made  significant  contributions  to  classical  mechanics.  In  contrast  to  Newton’s  laws  of  motion, 
which  are  based  on  the  concept  of  momentum,  Leibniz  devised  a new  theory  of  dynamics  based  on  kinetic 
and  potential  energy  that  anticipates  the  analytical  variational  approach  of  Lagrange  and  Hamilton.  Leibniz 
argued for a quantity called the "vis viva", which is Latin for "living force", that equals twice the kinetic
energy.  Leibniz  argued  that  the  change  in  kinetic  energy  is  equal  to  the  work  done.  In  1687  Leibniz 
proposed  that  the  optimum  path  is  based  on  minimizing  the  time  integral  of  the  vis  viva  which  is  equivalent 
to  the  action  integral.  Leibniz  used  both  philosophical  and  causal  arguments  in  his  work  which  were  equally 
acceptable  during  the  Age  of  Enlightenment.  Unfortunately  for  Leibniz,  his  analytical  approach  based  on 
energies,  which  are  scalars,  appeared  contradictory  to  Newton’s  intuitive  vectorial  treatment  of  force  and 
momentum.  There  was  considerable  prejudice  and  philosophical  opposition  to  the  variational  approach  which 
assumes  that  nature  is  thrifty  in  all  of  its  actions.  The  variational  approach  was  considered  to  be  speculative 
and  "metaphysical"  in  contrast  to  the  causal  arguments  supporting  Newtonian  mechanics.  This  opposition 
delayed  full  appreciation  of  the  variational  approach  until  the  start  of  the  20th  century. 

Johann  Bernoulli  (1667-1748)  was  a Swiss  mathematician  who  was  a student  of  Leibniz’s  calculus,  and 
sided  with  Leibniz  in  the  Newton-Leibniz  dispute  over  the  credit  for  developing  calculus.  Also  Bernoulli  sided 
with  the  Descartes’  vortex  theory  of  gravitation  which  delayed  acceptance  of  Newton’s  theory  of  gravitation 
in  Europe.  Bernoulli  pioneered  development  of  the  calculus  of  variations  by  solving  the  problems  of  the 
catenary,  the  brachistochrone,  and  Fermat’s  principle.  The  Bernoulli  family  is  famous  for  its  contributions 
to  mathematics  and  science;  Johann’s  son  Daniel  played  a significant  role  in  the  development  of  the  well- 
known  Bernoulli  Principle  in  hydrodynamics. 

Pierre  Louis  Maupertuis  (1698-1759)  was  a student  of  Johann  Bernoulli  and  conceived  the  universal 
hypothesis  that  in  nature  there  is  a certain  quantity  called  action  which  is  minimized.  Although  this  bold 
assumption  correctly  anticipates  the  development  of  the  variational  approach  to  classical  mechanics,  he 
obtained  his  hypothesis  by  an  entirely  incorrect  method.  He  was  a dilettante  whose  mathematical  prowess 
was  far  behind  the  high  standards  of  that  time,  and  he  could  not  establish  satisfactorily  the  quantity  to  be 
minimized. His teleological¹ argument was influenced by Fermat’s principle and the corpuscle theory of light
that  implied  a close  connection  between  optics  and  mechanics. 

Leonhard Euler (1707-1783) was the preeminent Swiss mathematician of the 18th century and was
a student  of  Johann  Bernoulli.  Euler  developed,  with  full  mathematical  rigor,  the  calculus  of  variations 
following  in  the  footsteps  of  Johann  Bernoulli.  Euler  used  variational  calculus  to  solve  minimum/maximum 
isoperimetric  problems  which  had  attracted  and  challenged  the  early  developers  of  calculus,  Newton,  Leibniz, 
and  Bernoulli.  Euler  also  was  the  first  to  solve  the  rigid-body  rotation  problem  using  the  three  components 
of  the  angular  velocity  as  kinematical  variables.  Euler  became  blind  in  both  eyes  by  1766  but  that  did  not 
hinder  his  prolific  output  in  mathematics  due  to  his  remarkable  memory  and  mental  capabilities.  Euler’s 
contributions to mathematics are remarkable in quality and quantity; for example during 1775 he published
one mathematical paper per week in spite of being blind. Euler implied the principle of least
action using vis viva, which is not the exact form explicitly developed by Lagrange.

¹ Teleology is any philosophical account that holds that final causes exist in nature, meaning that, analogous to purposes
found in human actions, nature inherently tends toward definite ends.

Jean  le  Rond  d’Alembert  (1717-1785)  was  a French  mathematician  and  physicist  who  had  the 
clever  idea  of  extending  use  of  the  principle  of  virtual  work  from  statics  to  dynamics.  D’Alembert’s  Principle 
rewrites  the  principle  of  virtual  work  in  the  form 

$$ \sum_{i=1}^{N} \left( \mathbf{F}_i - \dot{\mathbf{p}}_i \right) \cdot \delta \mathbf{r}_i = 0 $$

where the inertial reaction force $\dot{\mathbf{p}}_i$ is subtracted from the corresponding force $\mathbf{F}_i$. This extension of the
principle of virtual work applies equally to both statics and dynamics leading to a single variational principle.
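
A simple check, offered here only as an illustrative sketch, is the case of a single particle that is free of constraints. Writing $\mathbf{F}$ for the force on the particle, $\mathbf{p}$ for its momentum, and $\delta\mathbf{r}$ for a virtual displacement, d’Alembert’s Principle reads

$$ \left( \mathbf{F} - \dot{\mathbf{p}} \right) \cdot \delta \mathbf{r} = 0 $$

Since no constraint restricts $\delta\mathbf{r}$, this must hold for every choice of virtual displacement, which is possible only if

$$ \mathbf{F} = \dot{\mathbf{p}} $$

that is, force equals the rate of change of momentum, which is Newton’s second law. In the static limit $\dot{\mathbf{p}} = 0$ the same expression reduces to the principle of virtual work.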

Joseph  Louis  Lagrange  (1736-1813)  was  an  Italian  mathematician  who  was  a student  of  Leonhard 
Euler and his work paralleled that of Euler. In 1788 Lagrange published his monumental treatise on analytical
mechanics entitled "Mécanique Analytique" which describes his new, immensely powerful, analytical
technique  that  can  solve  any  mechanical  problem  without  resort  to  geometrical  considerations.  His  theory 
only  required  the  analytical  form  of  the  scalar  quantities  kinetic  and  potential  energy.  In  the  preface  of 
his  book  he  refers  modestly  to  his  extraordinary  achievements  with  the  statement  "The  reader  will  find  no 
figures  in  the  work.  The  methods  which  I set  forth  do  not  require  either  constructions  or  geometrical  or 
mechanical  reasonings:  but  only  algebraic  operations,  subject  to  a regular  and  uniform  rule  of  procedure." 
Lagrange also introduced the concept of undetermined multipliers to handle auxiliary conditions, which plays
a vital part in theoretical mechanics. William Hamilton, an outstanding figure in the analytical formulation
of  classical  mechanics,  called  Lagrange  the  "Shakespeare  of  mathematics,"  on  account  of  the  extraordinary 
beauty,  elegance,  and  depth  of  the  Lagrangian  methods.  Lagrange  also  pioneered  numerous  significant 
contributions to mathematics. For example, Euler, Lagrange, and d’Alembert developed much of the mathematics
of partial differential equations. Lagrange survived the French Revolution and, in spite of being a
foreigner,  Napoleon  named  Lagrange  to  the  Legion  of  Honour  and  made  him  a Count  of  the  Empire  in  1808. 
Lagrange  was  honoured  by  being  buried  in  the  Pantheon. 

Jean  Baptiste  Joseph  Fourier  (1768-1830)  was  a French  mathematician  and  physicist  who  was  a 
student  of  Lagrange.  Fourier  is  most  famous  for  the  development  of  Fourier  analysis  which  includes  Fourier 
series,  and  Fourier  transforms.  His  work  has  many  applications  to  classical  mechanics  such  as  all  forms  of 
wave  motion,  signal  processing,  and  solving  for  the  eigenfunctions  of  linear  equations. 

1.6  19th  century 

The  zenith  in  development  of  the  variational  approach  to  classical  mechanics  occurred  during  the  19th  century 
primarily  due  to  the  work  of  Hamilton  and  Jacobi. 

Carl Friedrich Gauss (1777-1855) was a German child prodigy who made many significant contributions
to mathematics, astronomy and physics. He did not work directly on the variational approach, but
Gauss’s  law,  the  divergence  theorem,  and  the  Gaussian  statistical  distribution  are  important  examples  of 
concepts  that  he  developed  and  which  feature  prominently  in  classical  mechanics  as  well  as  other  branches 
of  physics,  and  mathematics. 

Siméon Poisson (1781-1840) was a brilliant mathematician who was a student of Lagrange. He
developed  the  Poisson  statistical  distribution  as  well  as  the  Poisson  equation  that  features  prominently  in 
electromagnetic  and  other  field  theories.  His  major  contribution  to  classical  mechanics  is  development,  in 
1809,  of  the  Poisson  bracket  formalism  which  featured  prominently  in  development  of  Hamiltonian  mechanics 
and  quantum  mechanics. 

William  Hamilton  (1805-1865)  was  a brilliant  Irish  physicist,  astronomer  and  mathematician  who  was 
appointed  professor  of  astronomy  at  Dublin  when  he  was  barely  22  years  old.  He  developed  the  Hamiltonian 
mechanics  formalism  of  classical  mechanics  which  now  plays  a pivotal  role  in  modern  classical  and  quantum 
mechanics.  He  opened  an  entirely  new  world  beyond  the  developments  of  Lagrange.  Whereas  the  Lagrange 
equations  of  motion  are  complicated  second-order  differential  equations,  Hamilton  succeeded  in  transforming 
them  into  a set  of  first-order  differential  equations  with  twice  as  many  variables  that  consider  momenta  and 
the  conjugate  positions  as  independent  variables.  The  differential  equations  of  Hamilton  are  linear,  have 
separated  derivatives,  and  represent  the  simplest  and  most  desirable  form  possible  for  differential  equations  to 
be used in a variational approach. Hence the name "canonical variables" given by Jacobi. Hamilton exploited
the d’Alembert principle to give the first exact formulation of the principle of least action which underlies the
variational  principles  used  in  analytical  mechanics.  The  form  derived  by  Euler  and  Lagrange  employed  the 
principle in a way that applies only for conservative (scleronomic) cases. A significant discovery of Hamilton
is  his  realization  that  classical  mechanics  and  geometrical  optics  can  be  handled  from  one  unified  viewpoint. 
In  both  cases  he  uses  a "characteristic"  function  that  has  the  property  that,  by  mere  differentiation,  the 
path  of  the  body,  or  light  ray,  can  be  determined  by  the  same  partial  differential  equations.  This  solution  is 
equivalent  to  the  solution  of  the  equations  of  motion. 
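
The contrast between the Lagrange and Hamilton equations of motion described above can be made concrete with a short sketch. The symbols used here ($q_i$, $p_i$, $L$, $H$, $m$, $k$) follow common textbook usage and are assumptions of this illustration, which also assumes a system with $n$ generalized coordinates and no additional generalized forces. The Lagrange equations are $n$ second-order equations,

$$ \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0, \qquad i = 1, \ldots, n $$

Defining the conjugate momenta $p_i = \partial L / \partial \dot{q}_i$ and the Hamiltonian $H = \sum_i p_i \dot{q}_i - L$, Hamilton’s equations are $2n$ first-order equations,

$$ \dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i} $$

For example, a one-dimensional harmonic oscillator with $L = \tfrac{1}{2} m \dot{q}^2 - \tfrac{1}{2} k q^2$ gives the single second-order equation $m \ddot{q} = -k q$, whereas $H = p^2 / 2m + \tfrac{1}{2} k q^2$ gives the pair of first-order equations $\dot{q} = p/m$ and $\dot{p} = -k q$.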

Carl Gustav Jacob Jacobi (1804-1851), a Prussian mathematician and contemporary of Hamilton,
significantly developed Hamiltonian mechanics. He was one of the few who immediately recognized the
extraordinary importance of the Hamiltonian formulation of mechanics. Jacobi developed canonical transformation
theory and showed that the function used by Hamilton is only one special case of functions that
generate suitable canonical transformations. He proved that any complete solution of the partial differential
equation, without the specific boundary conditions applied by Hamilton, is sufficient for the complete
integration of the equations of motion. This greatly extends the usefulness of Hamilton’s partial differential
equations. In 1843 Jacobi developed both the Poisson bracket and the Hamilton-Jacobi formulations of
Hamiltonian mechanics. The latter gives a single, first-order partial differential equation for the action function
in terms of the n generalized coordinates which greatly simplifies solution of the equations of motion.
He  also  derived  a principle  of  least  action  for  time-independent  cases  which  had  been  studied  by  Euler  and 
Lagrange.  Jacobi  developed  a superior  approach  to  the  variational  integral  that,  by  eliminating  time  from 
the  integral,  determined  the  path  without  saying  anything  about  how  the  motion  occurs  in  time. 

James  Clerk  Maxwell  (1831-1879)  was  a Scottish  theoretical  physicist  and  mathematician.  His 
most  prominent  achievement  was  formulating  a classical  electromagnetic  theory  that  united  all  previously 
unrelated  observations,  experiments  and  equations  of  electricity,  magnetism  and  optics  into  one  consistent 
theory.  Maxwell’s  equations  demonstrated  that  electricity,  magnetism  and  light  are  all  manifestations  of  the 
same  phenomenon,  namely  the  electromagnetic  field.  Consequently,  all  other  classic  laws  and  equations  of 
electromagnetism were simplified cases of Maxwell’s equations. Maxwell’s achievements concerning electromagnetism
have been called the "second great unification in physics". Maxwell demonstrated that electric
and magnetic fields travel through space in the form of waves, and at the constant speed of light. In 1864
Maxwell wrote "A Dynamical Theory of the Electromagnetic Field" which proposed that light was in fact
undulations in the same medium that is the cause of electric and magnetic phenomena. His work in producing
a unified model of electromagnetism is one of the greatest advances in physics. Maxwell, in collaboration
with  Ludwig  Boltzmann  (1844-1906),  also  helped  develop  the  Maxwell-Boltzmann  distribution,  which  is 
a statistical  means  of  describing  aspects  of  the  kinetic  theory  of  gases.  These  two  discoveries  helped  usher  in 
the  era  of  modern  physics,  laying  the  foundation  for  such  fields  as  special  relativity  and  quantum  mechanics. 
Boltzmann  founded  the  field  of  statistical  mechanics  and  was  an  early  staunch  advocate  of  the  existence  of 
atoms  and  molecules. 

Henri Poincaré (1854-1912) was a French theoretical physicist and mathematician. He was the first to
present the Lorentz transformations in their modern symmetric form and discovered the remaining relativistic
velocity transformations. Although there is similarity to Einstein’s Special Theory of Relativity, Poincaré and
Lorentz still believed in the concept of the ether and did not fully comprehend the revolutionary philosophical
change implied by Einstein. Poincaré worked on the solution of the three-body problem in planetary motion
and was the first to discover a chaotic deterministic system which laid the foundations of modern chaos
theory. This discovery undermined the long-held deterministic view that if the positions and velocities of all the particles are
known  at  one  time,  then  it  is  possible  to  predict  the  future  for  all  time. 

The last two decades of the 19th century saw the culmination of classical physics and several important
discoveries that led to a revolution in science that toppled classical physics from its throne. The end of the
19th century was a time of tremendous technological progress: flight, the automobile,
and turbine-powered ships were developed, Niagara Falls was harnessed for power, etc. During this period,
Heinrich  Hertz  (1857-1894)  produced  electromagnetic  waves  confirming  their  derivation  using  Maxwell’s 
equations  as  well  as  simultaneously  discovering  the  photoelectric  effect.  Technical  developments,  such  as 
photography,  the  induction  spark  coil,  and  the  vacuum  pump  played  a significant  role  in  scientific  discoveries 
made during the 1890s. At the end of the 19th century, scientists thought that the basic laws were understood
and  worried  that  future  physics  would  be  in  the  fifth  decimal  place;  some  scientists  worried  that  little  was 
left  for  them  to  discover.  However,  there  remained  a few,  presumed  minor,  unexplained  discrepancies  plus 
new discoveries that led to the revolution in science that occurred at the beginning of the 20th century.


1.7 The 20th century revolution in physics

The  two  greatest  achievements  of  modern  physics  occurred  in  the  beginning  of  the  20th  century.  The  first 
was  Einstein’s  development  of  the  Theory  of  Relativity;  the  Special  Theory  of  Relativity  in  1905  and  the 
General  Theory  of  Relativity  in  1915.  This  was  followed  in  1925  by  the  development  of  quantum  mechanics. 

Albert Einstein (1879-1955) developed the Special Theory of Relativity in 1905 and the General Theory
of Relativity in 1915; both of these revolutionary theories had a profound impact on classical mechanics
and the underlying philosophy of physics. The Newtonian formulation of mechanics was shown to be an
approximation that applies only at low velocities while the General Theory of Relativity superseded Newton’s
Law of Gravitation and explained the Equivalence Principle. The Newtonian concepts of an absolute
frame  of  reference,  plus  the  assumption  of  the  separation  of  time  and  space  were  shown  to  be  invalid  at 
relativistic  velocities.  Einstein’s  postulate  that  the  laws  of  physics  are  the  same  in  all  inertial  frames  requires 
a revolutionary  change  in  the  philosophy  of  time,  space  and  reference  frames  which  leads  to  a breakdown 
in  the  Newtonian  formalism  of  classical  mechanics.  By  contrast,  the  Lagrange  and  Hamiltonian  variational 
formalisms  of  mechanics,  plus  the  principle  of  least  action,  remain  intact  using  a relativistically  invariant 
Lagrangian. The independence of the variational approach from the choice of reference frame is precisely what is
necessary for relativistic mechanics. The basic field equations also must remain invariant to the choice of
coordinate frame for the General Theory of Relativity. Thus the development of the Theory of Relativity
unambiguously demonstrated the superiority of the variational formulation of classical mechanics over
the vectorial Newtonian formulation, and thus the considerable effort made by Euler, Lagrange, Hamilton,
Jacobi, and others in developing the analytical variational formalism of classical mechanics finally came to
fruition at the start of the 20th century. Newton’s two crowning achievements, the Laws of Motion and the
Laws  of  Gravitation,  that  had  reigned  supreme  since  published  in  the  Principia  in  1687,  were  toppled  from 
the  throne  by  Einstein. 

Emmy  Noether  (1882-1935)  has  been  described  as  "the  greatest  ever  woman  mathematician".  In 
1915  she  proposed  a theorem  that  a conservation  law  is  associated  with  any  differentiable  symmetry  of  a 
physical  system.  Noether’s  theorem  evolves  naturally  from  Lagrangian  and  Hamiltonian  mechanics  and 
she  applied  it  to  the  four-dimensional  world  of  general  relativity.  Noether’s  theorem  has  had  an  important 
impact  in  guiding  the  development  of  modern  physics. 

Another  profound  development  that  had  a revolutionary  impact  on  classical  mechanics  was  quantum 
physics  plus  quantum  field  theory.  The  1913  model  of  atomic  structure  by  Niels  Bohr  (1885-1962)  and 
the  subsequent  enhancements  by  Arnold  Sommerfeld  (1868-1951),  were  based  completely  on  classical 
Hamiltonian  mechanics.  The  proposal  of  wave-particle  duality  by  Louis  de  Broglie  (1892-1987),  made 
in his 1924 thesis, was the catalyst leading to the development of quantum mechanics. In 1925 Werner
Heisenberg (1901-1976) and Max Born (1882-1970) developed a matrix representation of quantum
mechanics using non-commuting conjugate position and momentum variables.

Paul Dirac (1902-1984) showed in his Ph.D. thesis that Heisenberg’s matrix representation is based
on the Poisson Bracket generalization of Hamiltonian mechanics, which, in contrast to Hamilton’s canonical
equations, allows for non-commuting conjugate variables. In 1926 Erwin Schrödinger (1887-1961)
independently introduced the operational viewpoint and reinterpreted the partial differential equation of
Hamilton-Jacobi as a wave equation. His starting point was the optical-mechanical analogy of Hamilton
that is a built-in feature of the Hamilton-Jacobi theory. Schrödinger then showed that the wave mechanics
he developed, and the Heisenberg matrix mechanics, are equivalent representations of quantum mechanics.
In 1928 Dirac developed his relativistic equation of motion for the electron and pioneered the field of quantum
electrodynamics. Dirac also introduced the Lagrangian and the principle of least action to quantum
mechanics and these ideas were developed into the path-integral formulation of quantum mechanics and the
theory of electrodynamics by Richard Feynman (1918-1988).

The  concepts  of  wave-particle  duality,  and  quantization  of  observables,  both  are  beyond  the  classical 
notions  of  infinite  subdivisions  in  classical  physics.  In  spite  of  the  radical  departure  of  quantum  mechanics 
from  earlier  classical  concepts,  the  basic  feature  of  the  differential  equations  of  quantal  physics  is  their  self- 
adjoint  character  which  means  that  they  are  derivable  from  a variational  principle.  Thus  both  the  Theory 
of  Relativity,  and  quantum  physics  are  consistent  only  with  the  variational  principle  of  mechanics,  and 
not  Newtonian  mechanics.  As  a consequence  Newtonian  mechanics  has  been  dislodged  from  the  throne 
it  occupied  since  1687,  and  the  intellectually  beautiful  and  powerful  variational  principles  of  analytical 
mechanics have been validated.

Advances in classical mechanics continue to be made. For example, during the past four decades there
have  been  tremendous  advances  in  the  understanding  of  the  evolution  of  chaos  in  non-linear  systems.  This 
is  due  to  the  availability  of  computers  which  has  reopened  this  interesting  branch  of  classical  mechanics  that 
was pioneered by Henri Poincaré. Although classical mechanics is the most mature branch of physics, having
been studied for over 5000 years, there still are new research opportunities in this field of physics.

References: 

Excellent  sources  of  information  regarding  the  history  of  major  players  in  the  field  of  classical  mechanics 
can be found on Wikipedia and in the book "The Variational Principles of Mechanics" by Lanczos. [La49]


Chapter  2 


Review  of  Newtonian  mechanics 


2.1  Introduction 

It  is  assumed  that  the  reader  has  been  introduced  to  Newtonian  mechanics  applied  to  one  or  two  point  objects. 
This  chapter  reviews  Newtonian  mechanics  for  motion  of  many-body  systems  as  well  as  for  macroscopic 
sized  bodies.  Newton’s  Law  of  Gravitation  also  is  reviewed.  The  purpose  of  this  review  is  to  ensure  that  the 
reader  has  a solid  foundation  of  elementary  Newtonian  mechanics  upon  which  to  build  the  powerful  analytic 
Lagrangian  and  Hamiltonian  approaches  to  classical  dynamics. 

Newtonian  mechanics  is  based  on  application  of  Newton’s  Laws  of  motion  which  assume  that  the  concepts 
of  distance,  time,  and  mass,  are  absolute,  that  is,  motion  is  in  an  inertial  frame.  The  Newtonian  idea  of 
the  complete  separation  of  space  and  time,  and  the  concept  of  the  absoluteness  of  time,  are  violated  by  the 
Theory  of  Relativity  as  discussed  in  chapter  16.  However,  for  most  practical  applications,  relativistic  effects 
are negligible and Newtonian mechanics is an adequate description at low velocities. Therefore chapters
3 to 15 will assume velocities for which Newton’s laws of motion are applicable.

2.2  Newton’s  Laws  of  motion 

Newton defined a vector quantity called linear momentum $\mathbf{p}$ which is the product of mass and velocity.

$$\mathbf{p} = m\dot{\mathbf{r}} \tag{2.1}$$

Since the mass $m$ is a scalar quantity, the velocity vector $\dot{\mathbf{r}}$ and the linear momentum vector $\mathbf{p}$ are
colinear.

Newton’s  laws,  expressed  in  terms  of  linear  momentum,  are: 

1 Law  of  inertia:  A body  remains  at  rest  or  in  uniform  motion  unless  acted  upon  by  a force. 

2 Equation  of  motion:  A body  acted  upon  by  a force  moves  in  such  a manner  that  the  time  rate  of  change 
of  momentum  equals  the  force. 

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{2.2}$$

3 Action  and  reaction:  If  two  bodies  exert  forces  on  each  other  these  forces  are  equal  in  magnitude  and 
opposite  in  direction. 

Newton’s  second  law  contains  the  essential  physics  relating  the  force  F and  the  rate  of  change  of  linear 
momentum  p. 

Newton’s  first  law,  the  law  of  inertia,  is  a special  case  of  Newton’s  second  law  in  that  if 

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = 0 \tag{2.3}$$

then  p is  a constant  of  motion. 

Newton’s  third  law  also  can  be  interpreted  as  a statement  of  the  conservation  of  momentum,  that  is,  for 
a two  particle  system  with  no  external  forces  acting, 

$$\mathbf{F}_{12} = -\mathbf{F}_{21} \tag{2.4}$$

If the forces acting on two bodies are their mutual action and reaction, then equation 2.4 simplifies to

$$\mathbf{F}_{12} + \mathbf{F}_{21} = \frac{d\mathbf{p}_1}{dt} + \frac{d\mathbf{p}_2}{dt} = \frac{d}{dt}\left(\mathbf{p}_1 + \mathbf{p}_2\right) = 0 \tag{2.5}$$

This implies that the total linear momentum ($\mathbf{P} = \mathbf{p}_1 + \mathbf{p}_2$) is a constant of motion.
Combining equations 2.1 and 2.2 leads to a second-order differential equation

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = m\frac{d^2\mathbf{r}}{dt^2} = m\ddot{\mathbf{r}} \tag{2.6}$$



Note that the force on a body $\mathbf{F}$, and the resultant acceleration $\mathbf{a} = \ddot{\mathbf{r}}$, are colinear. Appendix C2 gives
explicit expressions for the acceleration $\mathbf{a}$ in cartesian and curvilinear coordinate systems. The definition of
force depends on the definition of the mass $m$. Newton’s laws of motion are obeyed to a high precision for
velocities much less than the velocity of light. For example, recent experiments have shown they are obeyed
with an error in the acceleration of $\Delta a < 5 \times 10^{-14}\ \mathrm{m/s^2}$.


2.3  Inertial  frames  of  reference 

An inertial frame of reference is one in which Newton’s Laws of motion are valid. It is a non-accelerated
frame of reference. An inertial frame must be homogeneous and isotropic. Physical experiments can be
carried out in different inertial reference frames.

The Galilean transformation provides a means of converting between two inertial frames of reference moving
at a constant relative velocity. Consider two reference frames $O$ and $O'$, with $O'$ moving with constant
velocity $\mathbf{V}$ relative to $O$ and with the two origins coinciding at $t = 0$. Figure 2.1 shows a Galilean
transformation which can be expressed in vector form.

$$\mathbf{r}' = \mathbf{r} - \mathbf{V}t \tag{2.7}$$

$$t' = t$$

Equation  2.7  gives  the  boost,  assuming  Newton’s  hypothesis 
that  the  time  is  invariant  to  change  of  inertial  frames  of  reference. 

Differentiation  of  this  transformation  gives 

$$\dot{\mathbf{r}}' = \dot{\mathbf{r}} - \mathbf{V} \tag{2.8}$$

$$\ddot{\mathbf{r}}' = \ddot{\mathbf{r}}$$

Note  that  the  forces  in  the  primed  and  unprimed  inertial  frames 
are  related  by 

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = m\ddot{\mathbf{r}} = m\ddot{\mathbf{r}}' = \mathbf{F}' \tag{2.9}$$

Thus  Newton’s  Laws  of  motion  are  invariant  under  a Galilean  transformation,  that  is,  the  inertial  mass  is 
unchanged  under  Galilean  transformations.  If  Newton’s  laws  are  valid  in  one  inertial  frame  of  reference, 
then  they  are  valid  in  any  frame  of  reference  in  uniform  motion  with  respect  to  the  first  frame  of  reference. 
This  invariance  is  called  Galilean  invariance.  There  are  an  infinite  number  of  possible  inertial  frames  all 
connected  by  Galilean  transformations. 

Galilean  invariance  violates  Einstein’s  Theory  of  Relativity.  In  order  to  satisfy  Einstein’s  postulate 
that  the  laws  of  physics  are  the  same  in  all  inertial  frames,  as  well  as  satisfy  Maxwell’s  equations  for 
electromagnetism,  it  is  necessary  to  replace  the  Galilean  transformation  by  the  Lorentz  transformation.  As 
will  be  discussed  in  chapter  16,  the  Lorentz  transformation  leads  to  Lorentz  contraction  and  time  dilation  both 
of which are related to the parameter $\gamma = \dfrac{1}{\sqrt{1 - \left(\frac{v}{c}\right)^2}}$, where $c$ is the velocity of light in vacuum. Fortunately,
most situations in life involve velocities where $v \ll c$; for example, for a body moving at 25,000 m.p.h.
(about 11,200 m/s), which is the escape velocity for a body at the surface of the earth, the $\gamma$ factor differs from
unity by about $7 \times 10^{-10}$, which is negligible. Relativistic effects are significant only in nuclear and particle
physics  and  some  exotic  conditions  in  astrophysics.  Thus,  for  the  purpose  of  classical  mechanics  usually  it 
is  reasonable  to  assume  that  the  Galilean  transformation  is  valid  and  is  well  obeyed  under  most  practical 
conditions. 
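
As a rough numerical check of how small the relativistic correction is at everyday speeds, the following Python sketch evaluates $\gamma - 1$ at 25,000 m.p.h. The function name `gamma_minus_one` and the variable names are illustrative choices, not part of the text; the conversion 1 m.p.h. = 0.44704 m/s and the value of $c$ are exact by definition.

```python
import math

C = 299_792_458.0      # speed of light in vacuum, m/s (exact SI value)
MPH_TO_MS = 0.44704    # metres per second in one mile per hour (exact)

def gamma_minus_one(v):
    """Return gamma - 1, with gamma = 1/sqrt(1 - (v/c)^2), for speed v in m/s."""
    beta_squared = (v / C) ** 2
    return 1.0 / math.sqrt(1.0 - beta_squared) - 1.0

# Escape velocity quoted in the text, converted to SI units.
v_escape = 25_000 * MPH_TO_MS

print(f"v = {v_escape:.0f} m/s, gamma - 1 = {gamma_minus_one(v_escape):.2e}")
```

The deviation of $\gamma$ from unity is of order $10^{-9}$ or smaller, which is why the Galilean transformation is adequate for the problems treated in this chapter.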


Figure 2.1: Frame $O'$ moving with a constant velocity $\mathbf{V}$ with respect to frame $O$ at the time $t$.

2.4  First-order  integrals  in  Newtonian  mechanics

A fundamental goal of mechanics is to determine the equations of motion for an $n$-body system, where
the force $\mathbf{F}_i$ acts on the individual mass $m_i$ where $1 \le i \le n$. Newton’s second-order equation of motion,
equation 2.6, must be solved to calculate the instantaneous spatial locations, velocities, and accelerations for
each mass $m_i$ of an $n$-body system. Both $\mathbf{F}_i$ and $\ddot{\mathbf{r}}_i$ are vectors each having three orthogonal components.
The solution of equation 2.6 involves integrating second-order equations of motion subject to a set of initial
conditions. Although this task appears simple in principle, it can be exceedingly complicated for many-body
systems. Fortunately, solution of the motion often can be simplified by exploiting three first-order integrals
of Newton’s equations of motion, that relate directly to conservation of either the linear momentum, angular
momentum, or energy of the system. In addition, for the special case of these three first-order integrals, the
internal motion of any many-body system can be factored out by a simple transformation into the center
of mass of the system. As a consequence, the following three first-order integrals are exploited extensively
in classical mechanics.


2.4.1  Linear  Momentum 

Newton’s  Laws  can  be  written  as  the  differential  and  integral  forms  of  the  first-order  time  integral  which 
equals  the  change  in  linear  momentum.  That  is 

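$$\mathbf{F}_i = \frac{d\mathbf{p}_i}{dt} \qquad \int_1^2 \mathbf{F}_i\,dt = \int_1^2 \frac{d\mathbf{p}_i}{dt}\,dt = (\mathbf{p}_2 - \mathbf{p}_1)_i \tag{2.10}$$

This allows Newton’s law of motion to be expressed directly in terms of the linear momentum $\mathbf{p}_i = m_i\dot{\mathbf{r}}_i$ of
each of the $1 \le i \le n$ bodies in the system. This first-order time integral features prominently in classical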
mechanics  since  it  connects  to  the  important  concept  of  linear  momentum  p.  This  first-order  time  integral 
gives  that  the  total  linear  momentum  is  a constant  of  motion  when  the  sum  of  the  external  forces  is  zero. 


2.4.2  Angular  momentum 

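The angular momentum $\mathbf{L}_i$ of a particle $i$ with linear momentum $\mathbf{p}_i$ with respect to an origin from which
the position vector $\mathbf{r}_i$ is measured, is defined by

$$\mathbf{L}_i \equiv \mathbf{r}_i \times \mathbf{p}_i \tag{2.11}$$

The torque, or moment of the force $\mathbf{N}_i$ with respect to the same origin is defined to be

$$\mathbf{N}_i \equiv \mathbf{r}_i \times \mathbf{F}_i \tag{2.12}$$

where $\mathbf{r}_i$ is the position vector from the origin to the point where the force $\mathbf{F}_i$ is applied. Note that the
torque $\mathbf{N}_i$ can be written as

$$\mathbf{N}_i = \mathbf{r}_i \times \frac{d\mathbf{p}_i}{dt} \tag{2.13}$$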

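Consider the time differential of the angular momentum,

$$\frac{d\mathbf{L}_i}{dt} = \frac{d}{dt}\left(\mathbf{r}_i \times \mathbf{p}_i\right) = \frac{d\mathbf{r}_i}{dt} \times \mathbf{p}_i + \mathbf{r}_i \times \frac{d\mathbf{p}_i}{dt} \tag{2.14}$$

However,

$$\frac{d\mathbf{r}_i}{dt} \times \mathbf{p}_i = m_i\frac{d\mathbf{r}_i}{dt} \times \frac{d\mathbf{r}_i}{dt} = 0 \tag{2.15}$$

Equations 2.13 to 2.15 can be used to write the first-order time integral for angular momentum in either
differential or integral form as

$$\frac{d\mathbf{L}_i}{dt} = \mathbf{r}_i \times \frac{d\mathbf{p}_i}{dt} = \mathbf{N}_i \qquad \int_1^2 \mathbf{N}_i\,dt = \int_1^2 \frac{d\mathbf{L}_i}{dt}\,dt = (\mathbf{L}_2 - \mathbf{L}_1)_i \tag{2.16}$$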


Newton’s  Law  relates  torque  and  angular  momentum  about  the  same  axis.  When  the  torque  about  any  axis 
is  zero  then  angular  momentum  about  that  axis  is  a constant  of  motion.  If  the  total  torque  is  zero  then  the 
total  angular  momentum,  as  well  as  the  components  about  three  orthogonal  axes,  all  are  constants. 
2.4.3  Kinetic  energy

The third first-order integral, that can be used for solving the equations of motion, is the first-order spatial
integral $\int \mathbf{F}_i \cdot d\mathbf{r}_i$. Note that this spatial integral is a scalar in contrast to the first-order time integrals for
linear and angular momenta which are vectors. The work done on a mass $m_i$ by a force $\mathbf{F}_i$ in transforming
from condition 1 to 2 is defined to be

$$[W_{12}]_i \equiv \int_1^2 \mathbf{F}_i \cdot d\mathbf{r}_i \tag{2.17}$$

If $\mathbf{F}_i$ is the net resultant force acting on a particle $i$, then the integrand can be written as

$$\mathbf{F}_i \cdot d\mathbf{r}_i = \frac{d\mathbf{p}_i}{dt} \cdot d\mathbf{r}_i = m_i\frac{d\mathbf{v}_i}{dt} \cdot \frac{d\mathbf{r}_i}{dt}\,dt = m_i\frac{d\mathbf{v}_i}{dt} \cdot \mathbf{v}_i\,dt = \frac{m_i}{2}\frac{d}{dt}\left(\mathbf{v}_i \cdot \mathbf{v}_i\right)dt = d\left(\frac{1}{2}m_i v_i^2\right) = d[T]_i \tag{2.18}$$

where the kinetic energy of a particle $i$ is defined as

$$[T]_i \equiv \frac{1}{2}m_i v_i^2 \tag{2.19}$$

Thus the work done on the particle $i$, that is, $[W_{12}]_i$, equals the change in kinetic energy of the particle if
there is no change in other contributions to the total energy such as potential energy, heat dissipation, etc.
That is

$$[W_{12}]_i = \left[\frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2\right]_i = [T_2 - T_1]_i \tag{2.20}$$

Thus the differential, and corresponding first integral, forms of the kinetic energy can be written as

$$\mathbf{F}_i = \frac{dT_i}{d\mathbf{r}_i} \qquad \int_1^2 \mathbf{F}_i \cdot d\mathbf{r}_i = (T_2 - T_1)_i \tag{2.21}$$

If the work done on the particle is positive, then the final kinetic energy $T_2 > T_1$. Especially noteworthy is that
the kinetic energy $[T]_i$ is a scalar quantity which makes it simple to use. This first-order spatial integral is the
foundation of the analytic formulation of mechanics that underlies Lagrangian and Hamiltonian mechanics.
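
The work-energy relation of this subsection can be checked numerically. The sketch below is only an illustration under stated assumptions: a single particle in a uniform gravitational field, with made-up values for the mass, initial velocity, and time interval, and with hypothetical helper names (`state`, `work`, `kinetic`). It sums $\mathbf{F} \cdot d\mathbf{r}$ along the trajectory and compares the result with the change in $\frac{1}{2}mv^2$.

```python
def dot(a, b):
    """Scalar product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

# Illustrative values only (not from the text).
m = 0.5                       # mass, kg
g = (0.0, -9.81)              # uniform acceleration, m/s^2
F = tuple(m * gk for gk in g) # constant net force on the particle, N
r0, v0 = (0.0, 0.0), (3.0, 4.0)

def state(t):
    """Position and velocity at time t for constant acceleration g."""
    r = tuple(r0[k] + v0[k] * t + 0.5 * g[k] * t * t for k in range(2))
    v = tuple(v0[k] + g[k] * t for k in range(2))
    return r, v

def work(t1, t2, steps=1000):
    """Approximate the line integral of F . dr by summing over short segments."""
    dt = (t2 - t1) / steps
    total = 0.0
    for n in range(steps):
        ra, _ = state(t1 + n * dt)
        rb, _ = state(t1 + (n + 1) * dt)
        total += dot(F, tuple(b - a for a, b in zip(ra, rb)))
    return total

def kinetic(t):
    """Kinetic energy (1/2) m v^2 at time t."""
    _, v = state(t)
    return 0.5 * m * dot(v, v)

W = work(0.0, 0.6)
dT = kinetic(0.6) - kinetic(0.0)
print(W, dT)   # the two numbers agree to rounding error
```

Here the work is negative because the particle is still rising on average over the interval, so gravity removes kinetic energy.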


2.5  Conservation  laws  in  classical  mechanics 

Elucidating  the  dynamics  in  classical  mechanics  is  greatly  simplified  when  conservation  laws  are  applicable. 
In  nature,  isolated  many-body  systems  frequently  conserve  one  or  more  of  the  first-order  integrals  for  linear 
momentum,  angular  momentum,  and  mass/energy.  Note  that  mass  and  energy  are  coupled  in  the  Theory 
of  Relativity,  but  for  non-relativistic  mechanics  the  conservation  of  mass  and  energy  are  decoupled.  Other 
observables  such  as  lepton  and  baryon  numbers  are  conserved,  but  these  conservation  laws  usually  can  be 
subsumed  under  conservation  of  mass  for  most  problems  in  non-relativistic  classical  mechanics.  The  power 
of  conservation  laws  in  calculating  classical  dynamics  makes  it  useful  to  combine  the  conservation  laws 
with  the  first  integrals  for  linear  momentum,  angular  momentum,  and  work-energy,  when  solving  problems 
involving Newtonian mechanics. These three conservation laws will be derived assuming Newton’s laws of
motion; however, these conservation laws are fundamental laws of nature that apply well beyond the domain
of applicability of Newtonian mechanics.


2.6  Motion  of  finite-sized  and  many-body  systems 

Elementary  presentations  in  classical  mechanics  discuss  motion  and  forces  involving  single  point  particles. 
However,  in  real  life,  single  bodies  have  a finite  size  introducing  new  degrees  of  freedom  such  as  rotation  and 
vibration,  and  frequently  many  finite-sized  bodies  are  involved.  A finite-sized  body  can  be  thought  of  as  a 
system  of  interacting  particles  such  as  the  individual  atoms  of  the  body.  The  interactions  between  the  parts 
of  the  body  can  be  strong  which  leads  to  rigid  body  motion  where  the  positions  of  the  particles  are  held 
fixed  with  respect  to  each  other,  and  the  body  can  translate  and  rotate.  When  the  interaction  between  the 
bodies  is  weaker,  such  as  for  a diatomic  molecule,  additional  vibrational  degrees  of  relative  motion  between 
the  individual  atoms  are  important.  Newton’s  third  law  of  motion  becomes  especially  important  for  such 
many-body  systems. 
2.7  Center  of  mass  of  a many-body  system


A finite sized body needs a reference point with respect to which the motion can be described. For example,
there are 8 corners of a cube that could serve as reference points, but the motion of each corner is
complicated if the cube is both translating and rotating. The treatment of the behavior of finite-sized bodies,
or many-body systems, is greatly simplified using the concept of center of mass. The center of mass is a
particular fixed point in the body that has an especially valuable property; that is, the translational motion
of a finite sized body can be treated like that of a point mass located at the center of mass. In addition the
translational motion is separable from the rotational-vibrational motion of a many-body system when the
motion is described with respect to the center of mass. Thus it is convenient at this juncture to introduce the
concept of center of mass of a many-body system.

For a many-body system, the position vector $\mathbf{r}_i$, defined relative to the laboratory system, is related to the
position vector $\mathbf{r}'_i$ with respect to the center of mass, and the center-of-mass location $\mathbf{R}$ relative to the
laboratory system. That is, as shown in figure 2.2

$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.22}$$

Figure 2.2: Position vector with respect to the center of mass.


This vector relation defines the transformation between the laboratory and center of mass systems. For
discrete and continuous systems respectively, the location of the center of mass is uniquely defined as being
where

$$\sum_i^n m_i\mathbf{r}'_i = \int_{body} \mathbf{r}'\rho\,dV = 0 \qquad \text{(Center of mass definition)}$$

Define the total mass $M$ as

$$M = \sum_i^n m_i = \int_{body} \rho\,dV \qquad \text{(Total mass)}$$


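The average location of the system corresponds to the location of the center of mass since $\frac{1}{M}\sum_i^n m_i\mathbf{r}'_i = 0$,
that is

$$\frac{1}{M}\sum_i^n m_i\mathbf{r}_i = \mathbf{R} + \frac{1}{M}\sum_i^n m_i\mathbf{r}'_i = \mathbf{R} \tag{2.23}$$
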


The vector $\mathbf{R}$, which describes the location of the center of mass, depends on the origin and coordinate
system chosen. For a continuous mass distribution the location vector of the center of mass is given by

$$\mathbf{R} = \frac{1}{M}\sum_i^n m_i\mathbf{r}_i = \frac{1}{M}\int \mathbf{r}\rho\,dV \tag{2.24}$$


The  center  of  mass  can  be  evaluated  by  calculating  the  individual  components  along  three  orthogonal  axes. 
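
The component-wise evaluation described above can be sketched in a few lines of Python. The function name `center_of_mass` and the three sample masses and positions are hypothetical; the sketch also checks the defining property that the mass-weighted sum of the positions measured from the center of mass vanishes.

```python
def center_of_mass(masses, positions):
    """Return the total mass M and the center-of-mass vector R = (1/M) sum m_i r_i."""
    M = sum(masses)
    dim = len(positions[0])
    R = tuple(sum(m * r[k] for m, r in zip(masses, positions)) / M
              for k in range(dim))
    return M, R

# Illustrative three-particle system (values are not from the text).
masses = [1.0, 2.0, 3.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]

M, R = center_of_mass(masses, positions)

# Positions relative to the center of mass, r'_i = r_i - R.
relative = [tuple(r[k] - R[k] for k in range(3)) for r in positions]

# Mass-weighted sum of r'_i; each component should vanish.
moment = tuple(sum(m * rp[k] for m, rp in zip(masses, relative))
               for k in range(3))
print(M, R, moment)
```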

The  center-of-mass  frame  of  reference  is  defined  as  the  frame  for  which  the  center  of  mass  is  stationary. 
This  frame  of  reference  is  especially  valuable  for  elucidating  the  underlying  physics  which  involves  only  the 
relative  motion  of  the  many  bodies.  That  is,  the  trivial  translational  motion  of  the  center  of  mass  frame, 
which  has  no  influence  on  the  relative  motion  of  the  bodies,  is  factored  out  and  can  be  ignored.  For  example, 
a tennis ball (0.06 kg) approaching the earth ($6 \times 10^{24}$ kg) with velocity $v$ could be treated in three frames,
(a)  assume  the  earth  is  stationary,  (b)  assume  the  tennis  ball  is  stationary,  or  (c)  the  center-of-mass  frame. 
The  latter  frame  ignores  the  center  of  mass  motion  which  has  no  influence  on  the  relative  motion  of  the 
tennis  ball  and  the  earth.  The  center  of  linear  momentum  and  center  of  mass  coordinate  frames  are  identical 
in  Newtonian  mechanics  but  not  in  relativistic  mechanics  as  described  in  chapter  16.4.3. 


2.8  Total  linear  momentum  of  a many-body  system 

2.8.1  Center-of-mass  decomposition 

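The total linear momentum $\mathbf{P}$ for a system of $n$ particles is given by

$$\mathbf{P} = \sum_i^n \mathbf{p}_i = \frac{d}{dt}\sum_i^n m_i\mathbf{r}_i \tag{2.25}$$

It is convenient to describe a many-body system by a position vector $\mathbf{r}'_i$ with respect to the center of mass.

$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.26}$$

That is,

$$\mathbf{P} = \sum_i^n \mathbf{p}_i = \frac{d}{dt}\sum_i^n m_i\mathbf{r}_i = \frac{d}{dt}M\mathbf{R} + \frac{d}{dt}\sum_i^n m_i\mathbf{r}'_i = \frac{d}{dt}M\mathbf{R} + 0 = M\dot{\mathbf{R}} \tag{2.27}$$

since $\sum_i^n m_i\mathbf{r}'_i = 0$ as given by the definition of the center of mass. That is;

$$\mathbf{P} = M\dot{\mathbf{R}} \tag{2.28}$$

Thus the total linear momentum for a system is the same as the momentum of a single particle of mass
$M = \sum_i^n m_i$ located at the center of mass of the system.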


2.8.2  Equations  of  motion 

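The force acting on particle $i$, in an $n$-particle many-body system, can be separated into an external force
$\mathbf{F}_i^{E}$ plus internal forces $\mathbf{f}_{ij}$ between the $n$ particles of the system

$$\mathbf{F}_i = \mathbf{F}_i^{E} + \sum_{\substack{j \\ j \neq i}}^n \mathbf{f}_{ij} \tag{2.29}$$

The origin of the external force is from outside of the system while the internal force is due to the mutual
interaction between the $n$ particles in the system. Newton’s Law tells us that

$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^{E} + \sum_{\substack{j \\ j \neq i}}^n \mathbf{f}_{ij} \tag{2.30}$$

Thus the rate of change of total momentum is

$$\dot{\mathbf{P}} = \sum_i^n \dot{\mathbf{p}}_i = \sum_i^n \mathbf{F}_i^{E} + \sum_i^n \sum_{\substack{j \\ j \neq i}}^n \mathbf{f}_{ij} \tag{2.31}$$

Note that since the indices are dummy then

$$\sum_i^n \sum_{\substack{j \\ j \neq i}}^n \mathbf{f}_{ij} = \sum_j^n \sum_{\substack{i \\ i \neq j}}^n \mathbf{f}_{ji} \tag{2.32}$$

Substituting Newton’s third law $\mathbf{f}_{ij} = -\mathbf{f}_{ji}$ into equation 2.32 implies that

$$\sum_i^n \sum_{\substack{j \\ j \neq i}}^n \mathbf{f}_{ij} = \sum_j^n \sum_{\substack{i \\ i \neq j}}^n \mathbf{f}_{ji} = -\sum_i^n \sum_{\substack{j \\ j \neq i}}^n \mathbf{f}_{ij} = 0 \tag{2.33}$$

which is satisfied only for the case where the summations equal zero. That is, for every internal force, there
is an equal and opposite reaction force that cancels that internal force.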

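Therefore the first-order integral for linear momentum can be written in differential and integral forms
as

$$\dot{\mathbf{P}} = \sum_i^n \mathbf{F}_i^{E} \qquad \int_1^2 \sum_i^n \mathbf{F}_i^{E}\,dt = \mathbf{P}_2 - \mathbf{P}_1 \tag{2.34}$$
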


The  reaction  of  a body  to  an  external  force  is  equivalent  to  a single  particle  of  mass  M located  at  the  center 
of  mass  assuming  that  the  internal  forces  cancel  due  to  Newton’s  third  law. 

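Note that the total linear momentum $\mathbf{P}$ is conserved if the net external force $\mathbf{F}^{E}$ is zero, that is

$$\mathbf{F}^{E} = \frac{d\mathbf{P}}{dt} = 0 \tag{2.35}$$

Therefore the $\mathbf{P}$ of the center of mass is a constant. Moreover, if the component of the force along any
direction $\hat{\mathbf{e}}$ is zero, that is,

$$\mathbf{F}^{E} \cdot \hat{\mathbf{e}} = \frac{d\mathbf{P} \cdot \hat{\mathbf{e}}}{dt} = 0 \tag{2.36}$$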

then  P • e is  a constant.  This  fact  is  used  frequently  to  solve  problems  involving  motion  in  a constant  force 
field.  For  example,  in  the  earth’s  gravitational  field,  the  momentum  of  an  object  moving  in  vacuum  in  the 
vertical  direction  is  time  dependent  because  of  the  gravitational  force,  whereas  the  horizontal  component  of 
momentum  is  constant  if  no  forces  act  in  the  horizontal  direction. 


2.1  Example:  Exploding  cannon  shell

Consider a cannon shell of mass $M$ moving along a parabolic trajectory in the earth’s gravitational field.
An internal explosion, generating an amount $E$ of mechanical energy, blows the shell into two parts. One
part of mass $kM$, where $k < 1$, continues moving along the same trajectory with velocity $v'$ while the other
part is reduced to rest. Find the velocity of the mass $kM$ immediately after the explosion.

It is important to remember that the energy release $E$ is given in the center of mass. If the velocity of the
shell immediately before the explosion is $v$ and $v'$ is the velocity of the $kM$ part immediately after the
explosion, then energy conservation gives that $\frac{1}{2}Mv^2 + E = \frac{1}{2}kMv'^2$.
The conservation of linear momentum gives $Mv = kMv'$. Eliminating
$v$ from these equations gives

$$v' = \sqrt{\frac{2E}{kM(1-k)}}$$

Figure: Exploding cannon shell


2.2  Example:  Billiard-ball  collisions

A billiard ball with mass $m$ and incident velocity $v$ collides with an identical stationary ball. Assume that
the balls bounce off each other elastically in such a way that the incident ball is deflected at a scattering angle
$\theta$ to the incident direction. Calculate the final velocities $v_f$ and $V_f$ of the two balls and the scattering angle $\phi$
of the target ball. The conservation of linear momentum in the incident direction $x$ gives $mv = mv_f\cos\theta +
mV_f\cos\phi$. The linear momentum in the perpendicular direction gives $0 = mv_f\sin\theta - mV_f\sin\phi$. The energy
is conserved since the collision is elastic. Thus

$$\frac{1}{2}mv^2 = \frac{1}{2}mv_f^2 + \frac{1}{2}mV_f^2$$

Solving these three equations gives $\phi = 90^\circ - \theta$, that is, the balls bounce off perpendicular to each other in
the laboratory frame. The final velocities are

$$v_f = v\cos\theta$$

$$V_f = v\sin\theta$$
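
The results of the two examples above can be checked against the conservation laws they were derived from. The Python sketch below is illustrative only: the function names and the numerical inputs are assumptions, not values from the text. It evaluates the fragment speed $v' = \sqrt{2E/[kM(1-k)]}$ for the cannon shell, and $v_f = v\cos\theta$, $V_f = v\sin\theta$, $\phi = 90^\circ - \theta$ for the billiard balls, and then computes the residuals of the energy and momentum equations.

```python
import math

def shell_fragment_speed(E, M, k):
    """Speed v' of the fragment of mass kM, from
    (1/2) M v^2 + E = (1/2) k M v'^2 and M v = k M v'."""
    return math.sqrt(2.0 * E / (k * M * (1.0 - k)))

def billiard_final_state(v, theta):
    """Return (v_f, V_f, phi) for an elastic collision of equal masses,
    with the incident ball deflected through the angle theta (radians)."""
    return v * math.cos(theta), v * math.sin(theta), math.pi / 2.0 - theta

# Cannon shell: illustrative numbers only.
E, M, k = 1200.0, 10.0, 0.4
v_prime = shell_fragment_speed(E, M, k)
v_before = k * v_prime                      # from M v = k M v'
energy_residual = 0.5 * M * v_before**2 + E - 0.5 * k * M * v_prime**2

# Billiard balls: illustrative numbers only (the mass m cancels throughout).
v, theta = 2.0, math.radians(30.0)
v_f, V_f, phi = billiard_final_state(v, theta)
px_residual = v - (v_f * math.cos(theta) + V_f * math.cos(phi))
py_residual = v_f * math.sin(theta) - V_f * math.sin(phi)
ke_residual = v**2 - (v_f**2 + V_f**2)

print(v_prime, energy_residual, px_residual, py_residual, ke_residual)
```

All four residuals vanish to rounding error, confirming that the quoted solutions satisfy both momentum and energy conservation.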
2.9  Angular  momentum  of  a many-body  system

2.9.1  Center-of-mass  decomposition 

As was the case for linear momentum, for a many-body system it is possible to separate the angular
momentum into two components. One component is the angular momentum about the center of mass and the
other component is the angular motion of the center of mass about the origin of the coordinate system. This
separation is done by describing the angular momentum of a many-body system using a position vector $\mathbf{r}'_i$
with respect to the center of mass plus the vector location $\mathbf{R}$ of the center of mass.

$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.37}$$


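The total angular momentum

$$\begin{aligned}
\mathbf{L} &= \sum_i^n \mathbf{L}_i = \sum_i^n \mathbf{r}_i \times \mathbf{p}_i \\
&= \sum_i^n \left(\mathbf{R} + \mathbf{r}'_i\right) \times m_i\left(\dot{\mathbf{R}} + \dot{\mathbf{r}}'_i\right) \\
&= \sum_i^n m_i\left[\mathbf{r}'_i \times \dot{\mathbf{r}}'_i + \mathbf{r}'_i \times \dot{\mathbf{R}} + \mathbf{R} \times \dot{\mathbf{r}}'_i + \mathbf{R} \times \dot{\mathbf{R}}\right]
\end{aligned} \tag{2.38}$$

Note that if the position vectors are with respect to the center of mass, then $\sum_i^n m_i\mathbf{r}'_i = 0$ resulting in the
middle two terms in the bracket being zero, that is;

$$\mathbf{L} = \sum_i^n \mathbf{r}'_i \times \mathbf{p}'_i + \mathbf{R} \times \mathbf{P} \tag{2.39}$$

The total angular momentum separates into two terms, the angular momentum about the center of mass,
plus the angular momentum of the center of mass about the origin of the axis system. This factoring of the
angular momentum only applies for the center of mass. This is called Samuel König’s first theorem.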
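
The separation of the total angular momentum into a center-of-mass part and a part about the center of mass can be verified numerically. The sketch below assumes an arbitrary three-particle system with made-up masses, positions, and velocities; the helper names are hypothetical. It compares $\sum_i \mathbf{r}_i \times \mathbf{p}_i$ with $\sum_i \mathbf{r}'_i \times \mathbf{p}'_i + \mathbf{R} \times \mathbf{P}$, where $\mathbf{p}'_i = m_i\dot{\mathbf{r}}'_i$.

```python
def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(s, a):
    return tuple(s * x for x in a)

def mass_weighted_mean(masses, vectors):
    """Return (1/M) sum m_i u_i for a list of vectors u_i."""
    total = (0.0, 0.0, 0.0)
    for m, u in zip(masses, vectors):
        total = add(total, scale(m, u))
    return scale(1.0 / sum(masses), total)

# Illustrative system (values are not from the text).
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
velocities = [(0.0, 1.0, 0.0), (0.0, 0.0, -1.0), (2.0, 0.0, 0.0)]

M = sum(masses)
R = mass_weighted_mean(masses, positions)    # center-of-mass position
V = mass_weighted_mean(masses, velocities)   # center-of-mass velocity
P = scale(M, V)                              # total linear momentum

L_total = (0.0, 0.0, 0.0)      # sum of r_i x p_i in the laboratory frame
L_internal = (0.0, 0.0, 0.0)   # sum of r'_i x p'_i about the center of mass
for m, r, v in zip(masses, positions, velocities):
    L_total = add(L_total, cross(r, scale(m, v)))
    L_internal = add(L_internal, cross(sub(r, R), scale(m, sub(v, V))))

L_cm = cross(R, P)             # angular momentum of the center of mass
difference = sub(L_total, add(L_internal, L_cm))
print(L_total, L_internal, L_cm, difference)
```

The difference vanishes to rounding error for any choice of masses, positions, and velocities, because the cross terms are proportional to $\sum_i m_i\mathbf{r}'_i$ and its time derivative.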


2.9.2  Equations  of  motion 

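The time derivative of the angular momentum

$$\dot{\mathbf{L}}_i = \frac{d}{dt}\left(\mathbf{r}_i \times \mathbf{p}_i\right) = \dot{\mathbf{r}}_i \times \mathbf{p}_i + \mathbf{r}_i \times \dot{\mathbf{p}}_i \tag{2.40}$$

But

$$\dot{\mathbf{r}}_i \times \mathbf{p}_i = m_i\dot{\mathbf{r}}_i \times \dot{\mathbf{r}}_i = 0 \tag{2.41}$$

Thus

$$\dot{\mathbf{L}}_i = \mathbf{r}_i \times \dot{\mathbf{p}}_i = \mathbf{r}_i \times \mathbf{F}_i = \mathbf{N}_i \tag{2.42}$$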


Consider that the resultant force acting on particle $i$ in this $n$-particle system can be separated into an
external force $\mathbf{F}_i^{E}$ plus internal forces $\mathbf{f}_{ij}$ between the $n$ particles of the system

$$\mathbf{F}_i = \mathbf{F}_i^{E} + \sum_{\substack{j \\ j \neq i}}^n \mathbf{f}_{ij} \tag{2.43}$$

The origin of the external force is from outside of the system while the internal force is due to the interaction
with the other $n - 1$ particles in the system. Newton’s Law tells us that

$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^{E} + \sum_{\substack{j \\ j \neq i}}^n \mathbf{f}_{ij} \tag{2.44}$$
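The rate of change of total angular momentum is

$$\dot{\mathbf{L}} = \sum_i \dot{\mathbf{L}}_i = \sum_i \mathbf{r}_i \times \dot{\mathbf{p}}_i = \sum_i \mathbf{r}_i \times \mathbf{F}_i^{E} + \sum_i \sum_{\substack{j \\ j \neq i}} \mathbf{r}_i \times \mathbf{f}_{ij} \tag{2.45}$$

Since $\mathbf{f}_{ij} = -\mathbf{f}_{ji}$ the last expression can be written as

$$\sum_i \sum_{\substack{j \\ j \neq i}} \mathbf{r}_i \times \mathbf{f}_{ij} = \sum_i \sum_{\substack{j \\ i < j}} \left(\mathbf{r}_i - \mathbf{r}_j\right) \times \mathbf{f}_{ij} \tag{2.46}$$

Note that $(\mathbf{r}_i - \mathbf{r}_j)$ is the vector $\mathbf{r}_{ij}$ connecting $j$ to $i$. For central forces the force vector $\mathbf{f}_{ij} = f_{ij}\hat{\mathbf{r}}_{ij}$ thus

$$\sum_i \sum_{\substack{j \\ i < j}} \left(\mathbf{r}_i - \mathbf{r}_j\right) \times \mathbf{f}_{ij} = \sum_i \sum_{\substack{j \\ i < j}} \mathbf{r}_{ij} \times f_{ij}\hat{\mathbf{r}}_{ij} = 0 \tag{2.47}$$

That is, for central internal forces the total internal torque on a system of particles is zero, and the rate of
change of total angular momentum for central internal forces becomes

$$\dot{\mathbf{L}} = \sum_i \mathbf{r}_i \times \mathbf{F}_i^{E} = \sum_i \mathbf{N}_i^{E} = \mathbf{N}^{E} \tag{2.48}$$

where $\mathbf{N}^{E}$ is the net external torque acting on the system. Equation 2.48 leads to the differential and integral
forms of the first integral relating the total angular momentum to total external torque.

$$\dot{\mathbf{L}} = \mathbf{N}^{E} \qquad \int_1^2 \mathbf{N}^{E}\,dt = \mathbf{L}_2 - \mathbf{L}_1 \tag{2.49}$$

Angular momentum conservation occurs in many problems involving zero external torques $\mathbf{N}^{E} = 0$, plus
two-body central forces $\mathbf{F} = f(r)\hat{\mathbf{r}}$ since the torque on the particle about the center of the force is zero

$$\mathbf{N} = \mathbf{r} \times \mathbf{F} = f(r)\left[\mathbf{r} \times \hat{\mathbf{r}}\right] = 0 \tag{2.50}$$

Examples are the central gravitational force for stellar or planetary systems in astrophysics, and the central
electrostatic force manifest for motion of electrons in the atom. In addition, the component of angular
momentum about any axis $\mathbf{L} \cdot \hat{\mathbf{e}}$ is conserved if the net external torque about that axis $\mathbf{N} \cdot \hat{\mathbf{e}} = 0$.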
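
The cancellation of the internal forces and internal torques for central two-body forces can be illustrated numerically. The sketch below is a minimal example under stated assumptions: three particles interacting through an attractive inverse-square force with an arbitrary strength constant `G`, with made-up masses and positions and hypothetical helper names. It sums the internal forces $\mathbf{f}_{ij}$ and the internal torques $\mathbf{r}_i \times \mathbf{f}_{ij}$ over all pairs.

```python
import math

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(s, a):
    return tuple(s * x for x in a)

G = 1.0  # arbitrary coupling strength, chosen for illustration only

def internal_force(i, j, masses, positions):
    """Force f_ij on particle i due to particle j: attractive, inverse-square,
    and directed along r_ij = r_i - r_j, so that it is a central force."""
    r_ij = sub(positions[i], positions[j])
    distance = math.sqrt(sum(x * x for x in r_ij))
    return scale(-G * masses[i] * masses[j] / distance**3, r_ij)

# Illustrative system (values are not from the text).
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.5), (-1.0, 0.5, 3.0)]

net_internal_force = (0.0, 0.0, 0.0)
net_internal_torque = (0.0, 0.0, 0.0)
for i in range(len(masses)):
    for j in range(len(masses)):
        if i == j:
            continue
        f_ij = internal_force(i, j, masses, positions)
        net_internal_force = add(net_internal_force, f_ij)
        net_internal_torque = add(net_internal_torque,
                                  cross(positions[i], f_ij))

print(net_internal_force, net_internal_torque)
```

Both sums vanish to rounding error: the forces cancel in pairs by Newton's third law, and the torques cancel because each $\mathbf{f}_{ij}$ is parallel to $\mathbf{r}_i - \mathbf{r}_j$.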

2.3  Example:  Bolas  thrown  by  gaucho 

Consider the bolas thrown by a gaucho to catch cattle. This is a system with conserved linear and angular
momentum about certain axes. When the bolas leaves the gaucho’s hand the center of mass has a linear
velocity $\mathbf{V}$ and an angular momentum about the center of mass of $\mathbf{L}$. If no external torques act, then the
center of mass of the bolas will follow a typical ballistic trajectory in the earth’s gravitational field while the
angular momentum vector $\mathbf{L}$ is conserved, that is, both in magnitude and direction. The tension in the ropes
connecting the three balls does not impact the motion of the system as long as the ropes do not snap due to
centrifugal forces.


Bolas  thrown  by  a gaucho 



2.10  Work  and  kinetic  energy  for  a many-body  system 

2.10.1  Center-of-mass  kinetic  energy 

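For a many-body system the position vector $\mathbf{r}'_i$ of the $i$th particle with respect to the center of mass is given by

$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.51}$$

The location of the center of mass is uniquely defined as being at the location where $\int \rho\,\mathbf{r}'\,dV = 0$. The
velocity of the $i$th particle can be expressed in terms of the velocity of the center of mass $\dot{\mathbf{R}}$ plus the velocity
of the particle with respect to the center of mass $\dot{\mathbf{r}}'_i$. That is,

$$\dot{\mathbf{r}}_i = \dot{\mathbf{R}} + \dot{\mathbf{r}}'_i \tag{2.52}$$

The total kinetic energy $T$ is

$$T = \sum_i^n \frac{1}{2} m_i v_i^2 = \sum_i^n \frac{1}{2} m_i\,\dot{\mathbf{r}}_i\cdot\dot{\mathbf{r}}_i = \sum_i^n \frac{1}{2} m_i\,\dot{\mathbf{r}}'_i\cdot\dot{\mathbf{r}}'_i + \left(\frac{d}{dt}\sum_i^n m_i\mathbf{r}'_i\right)\cdot\dot{\mathbf{R}} + \sum_i^n \frac{1}{2} m_i\,\dot{\mathbf{R}}\cdot\dot{\mathbf{R}} \tag{2.53}$$

For the special case of the center of mass, the middle term is zero since, by definition of the center of mass,
$\sum_i^n m_i\mathbf{r}'_i = 0$. Therefore, writing $M = \sum_i^n m_i$, $\mathbf{V} = \dot{\mathbf{R}}$ and $v'_i = |\dot{\mathbf{r}}'_i|$,

$$T = \sum_i^n \frac{1}{2} m_i v_i'^2 + \frac{1}{2} M V^2 \tag{2.54}$$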


Thus the total kinetic energy of the system is equal to the sum of the kinetic energy of a mass $M$ moving
with the center of mass velocity plus the kinetic energy of motion of the individual particles relative to the
center of mass. This is called Samuel König's second theorem.

Note that for a fixed kinetic energy of motion relative to the center of mass, the total kinetic energy $T$ has a
minimum value of $\sum_i^n \frac{1}{2} m_i v_i'^2$ when the velocity of the center of mass $V = 0$. For a given internal excitation energy, the minimum energy
required to accelerate colliding bodies occurs when the colliding bodies have identical, but opposite, linear
momenta. That is, when the center-of-mass velocity $V = 0$.
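
The decomposition in equation 2.54 can be checked numerically. The following is a minimal sketch, assuming an arbitrary set of five particles with random masses and velocities (these values, and the variable names, are illustrative and not taken from the text). It compares the total kinetic energy with the sum of the center-of-mass term and the internal term.

```python
import random

random.seed(0)  # fixed seed so the illustrative numbers are reproducible

# Hypothetical system: five particles with random masses and 3D velocities.
n = 5
masses = [random.uniform(1.0, 3.0) for _ in range(n)]
velocities = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n)]

M = sum(masses)
# Center-of-mass velocity V = (sum of m_i v_i) / M, component by component.
V = [sum(m * v[k] for m, v in zip(masses, velocities)) / M for k in range(3)]


def speed_sq(v):
    """Squared magnitude of a 3-vector."""
    return sum(c * c for c in v)


# Total kinetic energy in the original frame.
T_total = sum(0.5 * m * speed_sq(v) for m, v in zip(masses, velocities))

# Kinetic energy of motion relative to the center of mass.
relative = [[v[k] - V[k] for k in range(3)] for v in velocities]
T_internal = sum(0.5 * m * speed_sq(u) for m, u in zip(masses, relative))

# Kinetic energy of the total mass moving with the center of mass.
T_cm = 0.5 * M * speed_sq(V)

print(T_total, T_internal + T_cm)
```

The two printed numbers agree to rounding error, and `T_internal` is the minimum value that `T_total` could take if the center of mass were at rest.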


2.10.2  Conservative  forces  and  potential  energy 

In general, the line integral of a force field $\mathbf{F}$, that is, $\int_1^2 \mathbf{F}\cdot d\mathbf{r}$, is both path and time dependent. However,
an important class of forces, called conservative forces, exists for which the following two conditions are obeyed.

1)  Time  independence: 

The  force  depends  only  on  the  particle  position  r,  that  is,  it  does  not  depend  on  velocity  or  time. 

2)  Path  independence: 

For  any  two  points  1 and  2,  the  work  done  by  F is  independent  of  the  path  taken  between  1 and  2. 

If forces are path independent, then it is possible to define a scalar field, called potential energy and
denoted by $U(\mathbf{r})$, that is only a function of position. The path independence can be expressed by noting
that the integral around a closed loop is zero. That is

$$\oint \mathbf{F}\cdot d\mathbf{r} = 0 \tag{2.55}$$

Applying Stokes theorem for a path-independent force leads to the alternate statement that the curl is zero.
See appendix G.3.3.

$$\nabla\times\mathbf{F} = 0 \tag{2.56}$$

Note that the vector product of two del operators $\nabla$ acting on a scalar field $U$ equals

$$\nabla\times\nabla U = 0 \tag{2.57}$$

Thus it is possible to express a path-independent force field as the gradient of a scalar field, $U$, that is

$$\mathbf{F} = -\nabla U \tag{2.58}$$

Then the spatial integral

$$\int_1^2 \mathbf{F}\cdot d\mathbf{r} = -\int_1^2 (\nabla U)\cdot d\mathbf{r} = U_1 - U_2 \tag{2.59}$$

Thus for a path-independent force, the work done on the particle is given by the change in potential energy
if there is no change in kinetic energy. For example, if an object is lifted against the gravitational field, then
work is done on the particle and the final potential energy $U_2$ exceeds the initial potential energy, $U_1$.

2.10.3  Total  mechanical  energy 

The  total  mechanical  energy  E of  a particle  is  defined  as  the  sum  of  the  kinetic  and  potential  energies. 

E = T + U (2.60) 

Note that the potential energy is defined only to within an additive constant since the force $\mathbf{F} = -\nabla U$
depends only on differences in potential energy. Similarly, the kinetic energy is not absolute since any inertial
frame of reference can be used to describe the motion and the velocity of a particle depends on the relative
velocities of inertial frames. Thus the total mechanical energy $E = T + U$ is not absolute.

If a single particle is subject to several path-independent forces, such as gravity, linear restoring forces,
etc., then a potential energy $U_i$ can be ascribed to each of the $m$ forces where for each force $\mathbf{F}_i = -\nabla U_i$. In
contrast to the forces, which add vectorially, these scalar potential energies are additive, $U = \sum_i^m U_i$. Thus
the total mechanical energy for $m$ potential energies equals

$$E = T + U(\mathbf{r}) = T + \sum_i^m U_i(\mathbf{r}) \tag{2.61}$$

The time derivative of the total mechanical energy $E = T + U$ equals

$$\frac{dE}{dt} = \frac{dT}{dt} + \frac{dU}{dt} \tag{2.62}$$

Equation 2.18 gave that $dT = \mathbf{F}\cdot d\mathbf{r}$. Thus, the first term in equation 2.62 equals

$$\frac{dT}{dt} = \mathbf{F}\cdot\frac{d\mathbf{r}}{dt} \tag{2.63}$$


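The potential energy can be a function of both position and time. Thus the time derivative of the potential
energy due to change in both time and position is given as

$$\frac{dU}{dt} = \sum_i \frac{\partial U}{\partial x_i}\frac{dx_i}{dt} + \frac{\partial U}{\partial t} = (\nabla U)\cdot\frac{d\mathbf{r}}{dt} + \frac{\partial U}{\partial t} \tag{2.64}$$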


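The time derivative of the total mechanical energy is given using equations 2.63 and 2.64 in equation 2.62.

$$\frac{dE}{dt} = \frac{dT}{dt} + \frac{dU}{dt} = \mathbf{F}\cdot\frac{d\mathbf{r}}{dt} + (\nabla U)\cdot\frac{d\mathbf{r}}{dt} + \frac{\partial U}{\partial t} = \left[\mathbf{F} + (\nabla U)\right]\cdot\frac{d\mathbf{r}}{dt} + \frac{\partial U}{\partial t} \tag{2.65}$$
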

Note that if the field is path independent, that is $\nabla\times\mathbf{F} = 0$, then the force and potential are related by

$$\mathbf{F} = -\nabla U \tag{2.66}$$

Therefore, for path independent forces, the first term in the time derivative of the total energy in equation
2.65 is zero. That is,

$$\frac{dE}{dt} = \frac{\partial U}{\partial t} \tag{2.67}$$


In addition, when the potential energy $U$ is not an explicit function of time, then $\frac{\partial U}{\partial t} = 0$ and thus the total
energy is conserved. That is, for the combination of (a) path independence plus (b) time independence,
the total energy of a conservative field is conserved.

Note that there are cases where the concept of potential still is useful even when it is time dependent,
that is, if path independence applies, i.e. $\mathbf{F} = -\nabla U$ at any instant. Examples are a Coulomb field problem
where charges are slowly changing due to leakage etc., or a peripheral collision between two charged
bodies such as nuclei.


2.4  Example:  Central  force 

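A particle of mass $m$ moves along a trajectory given by $x = x_0\cos\omega_1 t$ and $y = y_0\sin\omega_2 t$.

a) Find the $x$ and $y$ components of the force and determine the condition for which the force is a central
force.

Differentiating with respect to time gives

$$\dot{x} = -x_0\omega_1\sin(\omega_1 t) \qquad \ddot{x} = -x_0\omega_1^2\cos(\omega_1 t)$$

$$\dot{y} = y_0\omega_2\cos(\omega_2 t) \qquad \ddot{y} = -y_0\omega_2^2\sin(\omega_2 t)$$

Newton’s second law gives

$$\mathbf{F} = m\left(\ddot{x}\,\hat{\mathbf{i}} + \ddot{y}\,\hat{\mathbf{j}}\right) = -m\left[x_0\omega_1^2\cos(\omega_1 t)\,\hat{\mathbf{i}} + y_0\omega_2^2\sin(\omega_2 t)\,\hat{\mathbf{j}}\right] = -m\left[\omega_1^2 x\,\hat{\mathbf{i}} + \omega_2^2 y\,\hat{\mathbf{j}}\right]$$

Note that if $\omega_1 = \omega_2 = \omega$ then

$$\mathbf{F} = -m\omega^2\left[x\,\hat{\mathbf{i}} + y\,\hat{\mathbf{j}}\right] = -m\omega^2\,\mathbf{r}$$

That is, it is a central force if $\omega_1 = \omega_2 = \omega$.

b) Find the potential energy as a function of $x$ and $y$.

Since

$$\mathbf{F} = -\nabla U = -\left[\frac{\partial U}{\partial x}\,\hat{\mathbf{i}} + \frac{\partial U}{\partial y}\,\hat{\mathbf{j}}\right]$$

then

$$U = \frac{1}{2} m\left(\omega_1^2 x^2 + \omega_2^2 y^2\right)$$

assuming that $U = 0$ at the origin.

c) Determine the kinetic energy of the particle and show that the total energy is conserved.

The kinetic energy is $T = \frac{1}{2} m\left(\dot{x}^2 + \dot{y}^2\right)$, so the total energy is

$$E = T + U = \frac{1}{2} m\left(\dot{x}^2 + \dot{y}^2\right) + \frac{1}{2} m\left(\omega_1^2 x^2 + \omega_2^2 y^2\right) = \frac{1}{2} m\left(x_0^2\omega_1^2 + y_0^2\omega_2^2\right)$$

since $\cos^2\theta + \sin^2\theta = 1$. Thus the total energy $E$ is a constant and is conserved.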
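
The result of part c) can be confirmed numerically. The sketch below assumes arbitrary illustrative values for $m$, $x_0$, $y_0$, $\omega_1$ and $\omega_2$ (none of these numbers come from the text) and evaluates $T + U$ at a set of times along the trajectory.

```python
import math

# Illustrative parameters (assumed values, not from the text).
m, x0, y0 = 2.0, 1.5, 0.7
w1, w2 = 3.0, 5.0


def total_energy(t):
    """Return T + U at time t for x = x0 cos(w1 t), y = y0 sin(w2 t)."""
    x = x0 * math.cos(w1 * t)
    y = y0 * math.sin(w2 * t)
    vx = -x0 * w1 * math.sin(w1 * t)
    vy = y0 * w2 * math.cos(w2 * t)
    kinetic = 0.5 * m * (vx**2 + vy**2)
    potential = 0.5 * m * (w1**2 * x**2 + w2**2 * y**2)
    return kinetic + potential


# Closed-form value from part c): E = (1/2) m (x0^2 w1^2 + y0^2 w2^2).
expected = 0.5 * m * (x0**2 * w1**2 + y0**2 * w2**2)

energies = [total_energy(0.1 * k) for k in range(50)]
print(min(energies), max(energies), expected)
```

The energy is the same at every sampled time even though $\omega_1 \neq \omega_2$, which illustrates that energy conservation does not require the force to be central.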


2.10.4  Total  mechanical  energy  for  conservative  systems 

Equation 2.20 showed that, using Newton’s second law, $\mathbf{F} = \frac{d\mathbf{p}}{dt}$, the first-order spatial integral gives that
the work done $W_{12}$ is related to the change in the kinetic energy. That is,

$$W_{12} = \int_1^2 \mathbf{F}\cdot d\mathbf{r} = \frac{1}{2} m v_2^2 - \frac{1}{2} m v_1^2 = T_2 - T_1 \tag{2.68}$$

The work done $W_{12}$ also can be evaluated in terms of the known forces $\mathbf{F}_i$ in the spatial integral.

Consider that the resultant force acting on particle $i$ in this $n$-particle system can be separated into an
external force $\mathbf{F}_i^E$ plus internal forces $\mathbf{f}_{ij}$ between the $n$ particles of the system

$$\mathbf{F}_i = \mathbf{F}_i^E + \sum_{j,\,j\neq i}^n \mathbf{f}_{ij} \tag{2.69}$$

The origin of the external force is from outside of the system while the internal force is due to the interaction
with the other $n - 1$ particles in the system. Newton’s Law tells us that

$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^E + \sum_{j,\,i\neq j}^n \mathbf{f}_{ij} \tag{2.70}$$


The work done on the system by a force moving from configuration $1 \to 2$ is given by

$$W_{1\to 2} = \sum_i^n \int_1^2 \mathbf{F}_i^E\cdot d\mathbf{r}_i + \sum_i^n \sum_{j,\,i\neq j}^n \int_1^2 \mathbf{f}_{ij}\cdot d\mathbf{r}_i \tag{2.71}$$

Since $\mathbf{f}_{ij} = -\mathbf{f}_{ji}$ then

$$W_{1\to 2} = \sum_i^n \int_1^2 \mathbf{F}_i^E\cdot d\mathbf{r}_i + \sum_i^n \sum_{j,\,i<j}^n \int_1^2 \mathbf{f}_{ij}\cdot\left(d\mathbf{r}_i - d\mathbf{r}_j\right) \tag{2.72}$$

where $d\mathbf{r}_i - d\mathbf{r}_j = d\mathbf{r}_{ij}$ is the vector from $j$ to $i$.

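Assume that both the external and internal forces are conservative, and thus can be derived from time
independent potentials, that is

$$\mathbf{F}_i^E = -\nabla_i U_i^{Ext} \tag{2.73}$$

$$\mathbf{f}_{ij} = -\nabla_i U_{ij}^{Int} \tag{2.74}$$

Then

$$\begin{aligned} W_{1\to 2} &= -\sum_i^n \int_1^2 \nabla_i U_i^{Ext}\cdot d\mathbf{r}_i - \sum_i^n \sum_{j,\,i<j}^n \int_1^2 \nabla_i U_{ij}^{Int}\cdot d\mathbf{r}_{ij} \\ &= \sum_i^n U_i^{Ext}(1) - \sum_i^n U_i^{Ext}(2) + \sum_i^n U_i^{Int}(1) - \sum_i^n U_i^{Int}(2) \\ &= U^{Ext}(1) - U^{Ext}(2) + U^{Int}(1) - U^{Int}(2) \end{aligned} \tag{2.75}$$

where $U_i^{Int} = \sum_{j,\,i<j}^n U_{ij}^{Int}$ collects the pair potentials involving particle $i$. Define the total external potential energy,

$$U^{Ext} = \sum_i^n U_i^{Ext} \tag{2.76}$$

and the total internal energy

$$U^{Int} = \sum_i^n U_i^{Int} \tag{2.77}$$

Equating the two equivalent equations for $W_{1\to 2}$, that is 2.68 and 2.75, gives that

$$W_{1\to 2} = T_2 - T_1 = U^{Ext}(1) - U^{Ext}(2) + U^{Int}(1) - U^{Int}(2) \tag{2.78}$$

Regrouping the terms in equation 2.78 gives

$$T_1 + U^{Ext}(1) + U^{Int}(1) = T_2 + U^{Ext}(2) + U^{Int}(2)$$

This shows that, for conservative forces, the total energy is conserved and is given by

$$E = T + U^{Ext} + U^{Int} \tag{2.79}$$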

The  three  first-order  integrals  for  linear  momentum,  angular  momentum,  and  energy  provide  powerful 
approaches  for  solving  the  motion  of  Newtonian  systems  due  to  the  applicability  of  conservation  laws  for  the 
corresponding  linear  and  angular  momentum  plus  energy  conservation  for  conservative  forces.  In  addition, 
the  important  concept  of  center-of-mass  motion  naturally  separates  out  for  these  three  first-order  integrals. 
Although  these  conservation  laws  were  derived  assuming  Newton’s  Laws  of  motion,  these  conservation  laws 
are  more  generally  applicable,  and  these  conservation  laws  surpass  the  range  of  validity  of  Newton’s  Laws  of 
motion. For example, in 1930 Pauli and Fermi postulated the existence of the neutrino in order to account for
the apparent non-conservation of energy and momentum in β-decay because they did not wish to relinquish the concepts
of energy and momentum conservation. The neutrino was first detected in 1956 confirming the correctness
of this hypothesis.

2.11 Virial Theorem


The virial theorem is an important theorem for a system of moving particles both in classical physics and
quantum physics. The Virial Theorem is useful when considering a collection of many particles and has a
special importance to central-force motion. For a general system of mass points with position vectors $\mathbf{r}_i$ and
applied forces $\mathbf{F}_i$, consider the scalar product $G$

$$G = \sum_i \mathbf{p}_i\cdot\mathbf{r}_i \tag{2.80}$$

where $i$ sums over all particles. The time derivative of $G$ is

$$\frac{dG}{dt} = \sum_i \mathbf{p}_i\cdot\dot{\mathbf{r}}_i + \sum_i \dot{\mathbf{p}}_i\cdot\mathbf{r}_i \tag{2.81}$$


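However,

$$\sum_i \mathbf{p}_i\cdot\dot{\mathbf{r}}_i = \sum_i m_i\,\dot{\mathbf{r}}_i\cdot\dot{\mathbf{r}}_i = \sum_i m_i v_i^2 = 2T \tag{2.82}$$

Also, since $\dot{\mathbf{p}}_i = \mathbf{F}_i$,

$$\sum_i \dot{\mathbf{p}}_i\cdot\mathbf{r}_i = \sum_i \mathbf{F}_i\cdot\mathbf{r}_i \tag{2.83}$$

Thus

$$\frac{dG}{dt} = 2T + \sum_i \mathbf{F}_i\cdot\mathbf{r}_i \tag{2.84}$$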

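The time average over a period $\tau$ is

$$\frac{1}{\tau}\int_0^\tau \frac{dG}{dt}\,dt = \frac{G(\tau) - G(0)}{\tau} = \langle 2T\rangle + \left\langle \sum_i \mathbf{F}_i\cdot\mathbf{r}_i \right\rangle \tag{2.85}$$

where the $\langle\,\rangle$ brackets refer to the time average. Note that if the motion is periodic and the chosen time $\tau$
equals a multiple of the period, then $\frac{G(\tau) - G(0)}{\tau} = 0$. Even if the motion is not periodic, if the coordinates and
velocities of all the particles remain finite, then there is an upper bound to $G$. This implies that choosing
$\tau \to \infty$ means that $\frac{G(\tau) - G(0)}{\tau} \to 0$. In both cases the left-hand side of the equation tends to zero giving the
virial theorem

$$\langle T\rangle = -\frac{1}{2}\left\langle \sum_i \mathbf{F}_i\cdot\mathbf{r}_i \right\rangle \tag{2.86}$$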


The right-hand side of this equation is called the virial of the system. For a single particle subject to a
conservative central force $\mathbf{F} = -\nabla U$ the Virial theorem equals

$$\langle T\rangle = \frac{1}{2}\langle \nabla U\cdot\mathbf{r}\rangle = \frac{1}{2}\left\langle r\frac{\partial U}{\partial r}\right\rangle \tag{2.87}$$

If the potential is of the form $U = kr^{n+1}$, that is, $F = -k(n+1)r^n$, then $r\frac{\partial U}{\partial r} = (n+1)U$. Thus for a single
particle in a central potential $U = kr^{n+1}$ the Virial theorem reduces to

$$\langle T\rangle = \frac{n+1}{2}\langle U\rangle \tag{2.88}$$

The following two special cases are of considerable importance in physics.

Hooke’s Law: Note that for a linear restoring force $n = 1$ then

$$\langle T\rangle = +\langle U\rangle \qquad (n = 1)$$


You  may  be  familiar  with  this  fact  for  simple  harmonic  motion  where  the  average  kinetic  and  potential 
energies  are  the  same  and  both  equal  half  of  the  total  energy. 


Inverse-square law: The other interesting case is for the inverse square law $n = -2$ where

$$\langle T\rangle = -\frac{1}{2}\langle U\rangle \qquad (n = -2)$$

The Virial theorem is useful for solving problems in that knowing the exponent $n$ of the field makes it
possible to write down directly the average total energy in the field. For example, for $n = -2$

$$\langle E\rangle = \langle T\rangle + \langle U\rangle = -\frac{1}{2}\langle U\rangle + \langle U\rangle = \frac{1}{2}\langle U\rangle \tag{2.89}$$

This occurs for the Bohr model of the hydrogen atom where the kinetic energy of the bound electron is half
of the magnitude of the potential energy. The same result occurs for planetary motion in the solar system.
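
The Hooke's law case of the Virial theorem can be illustrated by averaging over one period of simple harmonic motion. This is a minimal sketch, assuming illustrative values of the mass, spring constant and amplitude (the numbers and names are not from the text); it samples $x(t) = A\sin\omega_0 t$ uniformly in time and compares the averaged kinetic and potential energies.

```python
import math

# Illustrative oscillator parameters (assumed values).
m, k, A = 1.0, 4.0, 0.3
omega0 = math.sqrt(k / m)
period = 2.0 * math.pi / omega0

# Uniform midpoint sampling over exactly one period.
N = 1000
times = [(j + 0.5) * period / N for j in range(N)]

# Time averages of T = (1/2) m v^2 and U = (1/2) k x^2 for x = A sin(omega0 t).
avg_T = sum(0.5 * m * (A * omega0 * math.cos(omega0 * t)) ** 2 for t in times) / N
avg_U = sum(0.5 * k * (A * math.sin(omega0 * t)) ** 2 for t in times) / N

E = 0.5 * k * A**2  # total energy of the oscillator
print(avg_T, avg_U, E / 2)
```

Both averages equal half of the total energy, as expected from $\langle T\rangle = \langle U\rangle$ for $n = 1$.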

2.5  Example:  The  ideal  gas  law 

The Virial theorem deals with average properties and has applications to statistical mechanics. Consider
an ideal gas. According to the equipartition theorem the average kinetic energy per atom in an ideal gas is
$\frac{3}{2}kT$ where $T$ is the absolute temperature and $k$ is the Boltzmann constant. Thus the average total kinetic
energy for $N$ atoms is $\langle KE\rangle = \frac{3}{2}NkT$. The right-hand side of the Virial theorem contains the force $\mathbf{F}_i$. For
an ideal gas it is assumed that there are no interaction forces between atoms, that is, the only force is the
force of constraint of the walls of the pressure vessel. The pressure $P$ is force per unit area and thus the
instantaneous force on an area of wall $dA$ is $d\mathbf{F}_i = -\hat{\mathbf{n}}P\,dA$ where $\hat{\mathbf{n}}$ designates the unit vector normal to
the surface. Thus the right-hand side of the Virial theorem is

$$-\frac{1}{2}\left\langle \sum_i \mathbf{F}_i\cdot\mathbf{r}_i \right\rangle = \frac{P}{2}\int \hat{\mathbf{n}}\cdot\mathbf{r}_i\,dA$$

Use of the divergence theorem thus gives that $\int \hat{\mathbf{n}}\cdot\mathbf{r}_i\,dA = \int \nabla\cdot\mathbf{r}\,dV = 3\int dV = 3V$. Thus the Virial theorem
leads to the ideal gas law, that is

$$NkT = PV$$


2.6  Example:  The  mass  of  galaxies 

The Virial theorem can be used to make a crude estimate of the mass of a cluster of galaxies. Assuming a
spherically-symmetric cluster of $N$ galaxies, each of mass $m$, then the total mass of the cluster is $M = Nm$.
A crude estimate of the cluster potential energy is

$$\langle U\rangle \approx -\frac{GM^2}{R} \tag{α}$$

where $R$ is the radius of a cluster. The average kinetic energy per galaxy is $\frac{1}{2}m\langle v\rangle^2$ where $\langle v\rangle^2$ is the average
square of the galaxy velocities with respect to the center of mass of the cluster. Thus the total kinetic energy
of the cluster is

$$\langle KE\rangle \approx \frac{Nm\langle v\rangle^2}{2} = \frac{M\langle v\rangle^2}{2} \tag{β}$$

The Virial theorem tells us that a central force having a radial dependence of the form $F \propto r^n$ gives $\langle KE\rangle = \frac{n+1}{2}\langle U\rangle$.
For the inverse-square gravitational force then

$$\langle KE\rangle = -\frac{1}{2}\langle U\rangle \tag{γ}$$

Thus equations α, β and γ give an estimate of the total mass of the cluster to be

$$M \approx \frac{R\langle v\rangle^2}{G}$$

This estimate is larger than the value estimated from the luminosity of the cluster, implying that a large amount
of "dark matter" must exist in galaxies, which remains an open question in physics.
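
To give a feel for the magnitudes involved, the estimate $M \approx R\langle v\rangle^2/G$ can be evaluated for a hypothetical cluster. The radius of 1 Mpc and the velocity of 1000 km/s used below are round, assumed values chosen only for illustration; they are not measurements quoted in the text.

```python
# Physical constants in SI units (rounded).
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
MPC = 3.086e22         # one megaparsec in meters
M_SUN = 1.989e30       # one solar mass in kg

# Hypothetical cluster parameters (assumed for illustration only).
R = 1.0 * MPC          # cluster radius
v_rms = 1.0e6          # typical galaxy speed, 1000 km/s

# Virial estimate M ~ R <v>^2 / G from equations (alpha), (beta), (gamma).
M = R * v_rms**2 / G
M_solar = M / M_SUN

print(f"M = {M:.2e} kg = {M_solar:.2e} solar masses")
```

With these assumed inputs the estimate comes out at a few times $10^{14}$ solar masses. Being a crude order-of-magnitude argument, it is sensitive to the choice of $R$ and to the square of the velocity.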
2.12 Applications of Newton’s equations of motion


Newton’s equation of motion can be written in the form

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = m\frac{d\mathbf{v}}{dt} = m\frac{d^2\mathbf{r}}{dt^2} \tag{2.90}$$


A description of the motion of a particle requires a solution of this second-order differential equation of
motion. This equation of motion may be integrated to find $\mathbf{r}(t)$ and $\mathbf{v}(t)$ if the initial conditions and
the force field $\mathbf{F}(t)$ are known. Solution of the equation of motion can be complicated for many practical
examples, but there are various approaches to simplify the solution. It is of value to learn efficient approaches
to solving problems.

The  following  sequence  is  recommended 

a)  Make  a vector  diagram  of  the  problem  indicating  forces,  velocities,  etc. 

b)  Write  down  the  known  quantities. 

c)  Before  trying  to  solve  the  equation  of  motion  directly,  look  to  see  if  a basic  conservation  law  applies. 
That  is,  check  if  any  of  the  three  first-order  integrals,  can  be  used  to  simplify  the  solution.  The  use  of 
conservation  of  energy  or  conservation  of  momentum  can  greatly  simplify  solving  problems. 

The  following  examples  show  the  solution  of  typical  types  of  problem  encountered  using  Newtonian 
mechanics. 


2.12.1  Constant  force  problems 

Problems having a constant force imply constant acceleration. The classic example is a block sliding on an
inclined plane, where the block of mass $m$ is acted upon by both gravity and friction. The net force $\mathbf{F}$ is
given by the vector sum of the gravitational force $\mathbf{F}_g$, normal force $\mathbf{N}$ and frictional force $\mathbf{f}_f$.

$$\mathbf{F} = \mathbf{F}_g + \mathbf{N} + \mathbf{f}_f = m\mathbf{a} \tag{2.91}$$

Taking components perpendicular to the inclined plane in the $y$ direction

$$-F_g\cos\theta + N = 0 \tag{2.92}$$

That is, since $F_g = mg$,

$$N = mg\cos\theta \tag{2.93}$$

Similarly, taking components along the inclined plane in the $x$ direction

$$F_g\sin\theta - f_f = m\frac{d^2x}{dt^2} \tag{2.94}$$

Using the concept of coefficient of friction $\mu$

$$f_f = \mu N \tag{2.95}$$

Thus the equation of motion can be written as

$$mg\left(\sin\theta - \mu\cos\theta\right) = m\frac{d^2x}{dt^2} \tag{2.96}$$

The block accelerates if $\sin\theta > \mu\cos\theta$, that is, $\tan\theta > \mu$. The
acceleration is constant if $\mu$ and $\theta$ are constant, that is

$$\frac{d^2x}{dt^2} = g\left(\sin\theta - \mu\cos\theta\right) \tag{2.97}$$

Figure 2.3: Block on an inclined plane

Remember that if the block is stationary, the friction coefficient balances such that $(\sin\theta - \mu\cos\theta) = 0$,
that is, $\tan\theta = \mu$. However, there is a maximum static friction coefficient $\mu_S$ beyond which the block starts
sliding. The kinetic coefficient of friction $\mu_K$ is applicable for sliding friction and usually $\mu_K < \mu_S$.
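
Equation 2.97, together with the static-friction threshold, can be wrapped in a small function. This is a sketch under stated assumptions: the function name `incline_acceleration`, its arguments, and the default $g = 9.81$ m/s² are illustrative choices, and the block is assumed to start from rest.

```python
import math


def incline_acceleration(theta_deg, mu_k, mu_s, g=9.81):
    """Acceleration down the plane for a block released from rest.

    theta_deg: incline angle in degrees
    mu_k, mu_s: kinetic and static friction coefficients (mu_k <= mu_s)
    Returns 0.0 if static friction holds the block, else g (sin - mu_k cos).
    """
    theta = math.radians(theta_deg)
    # The block stays put as long as tan(theta) does not exceed mu_s.
    if math.tan(theta) <= mu_s:
        return 0.0
    # Once sliding, the kinetic coefficient applies (equation 2.97).
    return g * (math.sin(theta) - mu_k * math.cos(theta))


print(incline_acceleration(10.0, 0.3, 0.4))  # shallow slope: block does not move
print(incline_acceleration(30.0, 0.2, 0.3))  # steeper slope: block slides
```

Separating the static threshold from the kinetic formula reflects the distinction between $\mu_S$ and $\mu_K$ made in the text.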

Another example of constant force and acceleration is motion of objects free falling in a uniform gravitational
field when air drag is neglected. Then one obtains the simple relations such as $v = u + at$, etc.

2.12.2 Linear Restoring Force


An important class of problems involves a linear restoring force, that is, they obey Hooke’s law. The equation
of motion for this case is

$$F(x) = -kx = m\ddot{x} \tag{2.98}$$

It is usual to define

$$\omega_0^2 \equiv \frac{k}{m} \tag{2.99}$$

Then the equation of motion can be written as

$$\ddot{x} + \omega_0^2 x = 0 \tag{2.100}$$

which is the equation of the harmonic oscillator. Examples are small oscillations of a mass on a spring,
vibrations of a stretched piano string, etc.

The solution of this second order equation is

$$x(t) = A\sin\left(\omega_0 t - \delta\right) \tag{2.101}$$

This is the well known sinusoidal behavior of the displacement for the simple harmonic oscillator. The
angular frequency $\omega_0$ is

$$\omega_0 = \sqrt{\frac{k}{m}} \tag{2.102}$$

Note that for this linear system with no dissipative forces, the total energy is a constant of motion as
discussed previously. That is, it is a conservative system with a total energy $E$ given by

$$\frac{1}{2}m\dot{x}^2 + \frac{1}{2}kx^2 = E \tag{2.103}$$

The first term is the kinetic energy and the second term is the potential energy. The Virial theorem gives
that for the linear restoring force the average kinetic energy equals the average potential energy.


2.12.3  Position-dependent  conservative  forces 

The  linear  restoring  force  is  an  example  of  a conservative  field.  The  total  energy  E is  conserved,  and  if  the 
field  is  time  independent,  then  the  conservative  forces  are  a function  only  of  position.  The  easiest  way  to 
solve  such  problems  is  to  use  the  concept  of  potential  energy  U illustrated  in  Figure  2.4. 

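$$U_2 - U_1 = -\int_1^2 \mathbf{F}\cdot d\mathbf{r} \tag{2.104}$$

Consider a conservative force in one dimension. Since it was shown that the total energy $E = T + U$ is
conserved for a conservative field, then

$$E = T + U = \frac{1}{2}mv^2 + U(x) \tag{2.105}$$

Therefore:

$$v = \frac{dx}{dt} = \pm\sqrt{\frac{2}{m}\left[E - U(x)\right]} \tag{2.106}$$

Integration of this gives

$$t - t_0 = \int_{x_0}^{x} \frac{\pm\,dx}{\sqrt{\frac{2}{m}\left[E - U(x)\right]}} \tag{2.107}$$
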
where $x = x_0$ when $t = t_0$. Knowing $U(x)$ it is possible to solve this equation as a function of time.

It is possible to understand the general features of the
solution just from inspection of the function $U(x)$. For example,
as shown in figure 2.4 the motion for energy $E_1$
is periodic between the turning points $x_a$ and $x_b$. Since
the potential energy curve is approximately parabolic between
these limits the motion will exhibit simple harmonic
motion. For $E_0$ the turning points coalesce to $x_0$, that is,
there is no motion. For total energy $E_2$ the motion is
periodic in two independent regimes, $x_c < x < x_d$, and
$x_e < x < x_f$. Classically the particle cannot jump from
one pocket to the other. The motion for the particle with
total energy $E_3$ is that it moves freely from infinity, stops
and rebounds at $x = x_g$ and then returns to infinity. That
is, the particle bounces off the potential at $x_g$. For energy
$E_4$ the particle moves freely and is unbounded. For all
these cases, the actual velocity is given by the above relation
for $v(x)$. Thus the kinetic energy is largest where
the potential is deepest. An example would be motion of
a roller coaster car.
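
Equation 2.107 lends itself to numerical quadrature when $U(x)$ is known. The sketch below is one possible way to do this, using Simpson's rule; the helper name `travel_time` and the harmonic test potential are assumptions made for illustration. The integration range is kept away from the turning point, where the integrand diverges.

```python
import math


def travel_time(U, E, m, x_start, x_end, steps=1000):
    """Evaluate t - t0 from equation 2.107 (+ sign) with Simpson's rule.

    U: potential energy function U(x); E: total energy; m: mass.
    steps must be even, and E - U(x) must stay positive on the interval.
    """
    h = (x_end - x_start) / steps

    def integrand(x):
        return 1.0 / math.sqrt(2.0 / m * (E - U(x)))

    total = integrand(x_start) + integrand(x_end)
    for j in range(1, steps):
        weight = 4 if j % 2 else 2
        total += weight * integrand(x_start + j * h)
    return total * h / 3.0


# Test case: harmonic potential U = k x^2 / 2 with amplitude A (assumed values).
m, k, A = 1.0, 4.0, 0.5
E = 0.5 * k * A**2
t_numeric = travel_time(lambda x: 0.5 * k * x**2, E, m, 0.0, A / 2)

# For simple harmonic motion x = A sin(omega0 t), so t = asin(x / A) / omega0.
t_exact = math.asin(0.5) / math.sqrt(k / m)
print(t_numeric, t_exact)
```

For the harmonic potential the quadrature reproduces the analytic time to reach half the amplitude. The same function could be applied to other potentials, provided the interval excludes the turning points.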

Position-dependent forces are encountered extensively
in classical mechanics. Examples are the many manifestations
of motion in gravitational fields, such as interplanetary
probes, a roller coaster, and automobile suspension systems. The linear restoring force is an especially
simple example of a position-dependent force while the most frequently encountered conservative potentials
are in electrostatics and gravitation for which the potentials are:

$$U(r) = \frac{q_1 q_2}{4\pi\varepsilon_0\, r_{12}} \qquad \text{(Electrostatic potential energy)}$$

$$U(r) = -G\frac{m_1 m_2}{r_{12}} \qquad \text{(Gravitational potential energy)}$$

Figure 2.4: One-dimensional potential $U(x)$.

Knowing $U(r)$ it is possible to solve the equation of motion as a function of time.

2.7  Example:  Diatomic  molecule 

An example of a conservative field is a vibrating diatomic molecule which has a potential energy dependence
with separation distance $x$ that is described approximately by the Morse function

$$U(x) = U_0\left[1 - e^{-\frac{(x - x_0)}{\delta}}\right]^2 - U_0$$

where $U_0$, $x_0$, and $\delta$ are parameters chosen to best describe the particular pair of atoms. The restoring force
is given by

$$F(x) = -\frac{dU(x)}{dx} = -2\frac{U_0}{\delta}\left[1 - e^{-\frac{(x - x_0)}{\delta}}\right]e^{-\frac{(x - x_0)}{\delta}}$$

The potential has a minimum value of $U(x_0) = -U_0$ at $x = x_0$.
Note that for small amplitude oscillations, where

$$(x - x_0) \ll \delta$$

the exponential term in the potential function can be expanded to give

$$U(x) \approx U_0\left[1 - \left(1 - \frac{(x - x_0)}{\delta}\right)\right]^2 - U_0 \approx \frac{U_0}{\delta^2}(x - x_0)^2 - U_0$$

This gives a restoring force

$$F(x) = -\frac{dU(x)}{dx} = -2\frac{U_0}{\delta^2}(x - x_0)$$

That is, for small amplitudes the restoring force is linear.

Potential energy function $U(x)/U_0$ versus $x/\delta$
for the diatomic molecule.
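
The small-amplitude approximation can be compared with the full Morse force numerically. This is a minimal sketch, assuming illustrative values $U_0 = 1$, $x_0 = 1$ and $\delta = 0.5$ in arbitrary units (these numbers and the function names are hypothetical). The force expression is the derivative of the Morse function given above.

```python
import math


def morse_potential(x, U0, x0, delta):
    """Morse function U(x) = U0 [1 - exp(-(x - x0)/delta)]^2 - U0."""
    return U0 * (1.0 - math.exp(-(x - x0) / delta)) ** 2 - U0


def morse_force(x, U0, x0, delta):
    """Restoring force F = -dU/dx of the Morse function."""
    e = math.exp(-(x - x0) / delta)
    return -2.0 * U0 / delta * (1.0 - e) * e


def linear_force(x, U0, x0, delta):
    """Small-amplitude (Hooke's law) approximation to the Morse force."""
    return -2.0 * U0 / delta**2 * (x - x0)


# Illustrative parameters in arbitrary units (assumed values).
U0, x0, delta = 1.0, 1.0, 0.5

for fraction in (0.001, 0.01, 0.1, 0.5):
    x = x0 + fraction * delta
    exact = morse_force(x, U0, x0, delta)
    approx = linear_force(x, U0, x0, delta)
    print(f"(x - x0)/delta = {fraction}: exact {exact:.5f}, linear {approx:.5f}")
```

The two forces agree closely for displacements that are a small fraction of $\delta$ and diverge as the displacement approaches $\delta$, which is where the anharmonicity of the molecule becomes important.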
2.12.4 Constrained motion

A frequently  encountered  problem  with  position  dependent  forces  is  when  the  motion  is  constrained  to 
follow  a certain  trajectory.  Forces  of  constraint  must  exist  to  constrain  the  motion  to  a specific  trajectory. 
Examples  are,  the  roller  coaster,  a rolling  ball  on  an  undulating  surface,  or  a downhill  skier,  where  the 
motion  is  constrained  to  follow  the  surface  or  track  contours.  The  potential  energy  can  be  evaluated  at  all 
positions  along  the  constrained  trajectory  for  conservative  forces  such  as  gravity.  However,  the  additional 
forces  of  constraint  that  must  exist  to  constrain  the  motion,  can  be  complicated  and  depend  on  the  motion. 
For example, the roller coaster must always balance the gravitational and centripetal forces. Fortunately
forces of constraint $\mathbf{F}_C$ often are normal to the direction of motion and thus do not contribute to the total
mechanical energy since then the work done $\mathbf{F}_C\cdot d\mathbf{l}$ is zero. Magnetic forces $\mathbf{F} = q\mathbf{v}\times\mathbf{B}$ exhibit this feature
of having the force normal to the motion.

Solution  of  constrained  problems  is  greatly  simplified  if  the  other  forces  are  conservative  and  the  forces 
of  constraint  are  normal  to  the  motion,  since  then  energy  conservation  can  be  used. 

2.8  Example:  Roller  coaster 

Consider motion of a roller coaster shown in the
adjacent figure. This system is conservative if the friction
and air drag are neglected and then the forces of
constraint are normal to the direction of motion.

The kinetic energy at any position is just given by
energy conservation and the fact that

$$E = T + U$$

where $U$ depends on the height of the track at any
given location. The kinetic energy is greatest when the
potential energy is lowest. The forces of constraint
can be deduced if the velocity of motion on the track
is known. Assuming that the motion is confined to a
vertical plane, then one has a centripetal force of constraint
$\frac{mv^2}{\rho}$ normal to the track inwards towards the
center of the radius of curvature $\rho$, plus the gravitation
force downwards of $mg$.

The constraint force is $\frac{mv_t^2}{\rho} - mg$ upwards at the
top of the loop, while it is $\frac{mv_b^2}{\rho} + mg$ downwards at
the bottom of the loop. To ensure that the car and
occupants do not leave the required trajectory, the force
upwards at the top of the loop has to be positive, that
is, $v_t^2 \geq \rho g$. The velocity at the bottom of the loop
is given by $\frac{1}{2}mv_b^2 = \frac{1}{2}mv_t^2 + 2mg\rho$ assuming that the
track has a constant radius of curvature $\rho$. That is,
at a minimum $v_b^2 = \rho g + 4\rho g = 5\rho g$. Therefore the
occupants now will feel an acceleration downwards of
at least $\frac{v_b^2}{\rho} + g = 6g$ at the bottom of the loop. The
first roller coaster was built with such a constant radius of curvature but an acceleration of $6g$ was too much
for the average passenger. Therefore roller coasters are designed such that the radius of curvature is much
larger at the bottom of the loop, as illustrated, in order to maintain sufficiently low $g$ loads and also ensure
that the required constraint forces exist.

Roller coaster (CC0 Public Domain)

Note that the minimum velocity at the top of the loop, $v_t$, implies that if the cart starts from rest it must
start at a height $h \geq \frac{\rho}{2}$ above the top of the loop if friction is negligible. Note that the solution for the rolling
ball on such a roller coaster differs from that for a sliding object since one must include the rotational energy
of the ball as well as the linear velocity.

Looping the loop in a glider involves the same physics making it necessary to vary the elevator control to
vary the radius of curvature throughout the loop to minimize the maximum $g$ load.
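
The numbers in this example follow directly from energy conservation and can be collected in a short function. This is a sketch assuming a frictionless circular loop of constant radius $\rho$; the function name `loop_requirements` and the 10 m radius in the usage line are hypothetical choices for illustration.

```python
import math


def loop_requirements(rho, g=9.81):
    """Minimum conditions for a frictionless circular loop of radius rho.

    Returns (v_top, v_bottom, g_load_bottom, h_start), where h_start is the
    release height above the top of the loop for a cart starting from rest.
    """
    v_top_sq = rho * g                         # v_t^2 >= rho g at the top
    v_bottom_sq = v_top_sq + 4.0 * rho * g     # energy conservation over height 2 rho
    g_load_bottom = v_bottom_sq / (rho * g) + 1.0  # (v_b^2 / rho + g) in units of g
    h_start = v_top_sq / (2.0 * g)             # from m g h = (1/2) m v_t^2
    return math.sqrt(v_top_sq), math.sqrt(v_bottom_sq), g_load_bottom, h_start


v_top, v_bottom, g_load, h_start = loop_requirements(10.0)
print(f"v_top = {v_top:.2f} m/s, v_bottom = {v_bottom:.2f} m/s")
print(f"load at bottom = {g_load:.1f} g, release height above top = {h_start:.1f} m")
```

The load at the bottom is $6g$ and the release height is $\rho/2$ regardless of the radius chosen, which is why a constant-radius loop cannot reduce the load simply by being made larger.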
2.12.5 Velocity Dependent Forces

Velocity dependent forces are encountered frequently in practical problems. An example is motion of an
object in a fluid, such as air, where viscous forces retard the motion. In general the retarding force has a
complicated dependence on velocity. The drag force usually is expressed in terms of a drag coefficient $c_D$,

$$\mathbf{F}_D(v) = -\frac{1}{2}c_D\rho A v^2\,\hat{\mathbf{v}} \tag{2.108}$$

where $c_D$ is a dimensionless drag coefficient, $\rho$ is the density of air, $A$ is the cross sectional area perpendicular
to the direction of motion, and $v$ is the velocity. Modern automobiles have drag coefficients as low as 0.3. As
described in chapter 4, the drag coefficient $c_D$ depends on the Reynolds number which relates the inertial to
viscous drag forces. Small sized objects at low velocity, such as light raindrops, have low Reynolds numbers
for which $c_D$ is roughly proportional to $v^{-1}$ leading to a linear dependence of the drag force on velocity, i.e.
$F_D(v) \propto v$. Larger objects moving at higher velocities, such as a car or sky-diver, have higher Reynolds
numbers for which $c_D$ is roughly independent of velocity leading to a drag force $F_D(v) \propto v^2$. This drag force
always points in the opposite direction to the unit velocity vector. Approximately for air

$$\mathbf{F}_D(v) = -\left(c_1 v + c_2 v^2\right)\hat{\mathbf{v}} \tag{2.109}$$

where for spherical objects of diameter $D$, $c_1 \approx 1.55\times 10^{-4}D$ and $c_2 \approx 0.22D^2$ in MKS units. Fortunately, the
equation of motion usually can be integrated when the retarding force has a simple power law dependence.
As an example, consider free fall in the Earth’s gravitational field.

2.9  Example:  Vertical  fall  in  the  earth’s  gravitational  field. 

Linear  regime  c\  » c^v 

For  small  objects  at  low-velocity,  i.e.  low  Reynold’s  number,  the  drag  has  approximately  a linear  depen- 
dence on  velocity.  The  equation  of  motion  is 


dv 

— mg  — c\v  = m — 


Separate  the  variables  and  integrate 


t = 


mdv 

— mg  — cpv 


m / mg  + c\v  \ 
ci  \mg  + avo) 


That  is 


mq 

v = 


Cl 


Note  that  for  t the  velocity  approaches  a terminal  velocity  of  Voo 

constant  is  r = Note  that  if  vq  = 0,  then 


— mg.  y/jg  characteristic  time 


v = v00 


_ t_ 

e t 


For  the  case  of  small  raindrops  with  D = 0.5mm,  then  v ^ = 8 m/s  (18mph)  and  time  constant  r = 0.8  sec. 
Note  that  in  the  absence  of  air  drag,  these  rain  drops  falling  from  2000 m would  attain  a velocity  of  over 
400  m.p.h.  It  is  fortunate  that  the  drag  reduces  the  speed  of  rain  drops  to  non-damaging  values.  Note  that 
the  above  relation  would  predict  high  velocities  for  hail.  Fortunately,  the  drag  increases  quadratically  at  the 
higher  velocities  attained  by  large  rain  drops  or  hail,  and  this  limits  the  terminal  velocity  to  moderate  values. 
As  known  in  the  mid-west,  these  velocities  still  are  sufficient  to  do  considerable  crop  damage. 
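
The raindrop estimate for the linear regime can be reproduced with a short Python sketch. It assumes a spherical water drop of density 1000 kg/m^3, $g = 9.81$ m/s^2, and the coefficient $c_1 = 1.55 \times 10^{-4} D$ quoted for equation 2.109; the function name is illustrative only.

```python
import math

G = 9.81  # m/s^2, assumed value of the acceleration due to gravity


def linear_drag_velocity(t, mass, c1, v0=0.0):
    """Velocity at time t for dv/dt = -g - (c1/m) v, with upward taken as positive."""
    v_inf = -mass * G / c1      # terminal velocity (negative means downward)
    tau = mass / c1             # characteristic time constant
    return v_inf + (v0 - v_inf) * math.exp(-t / tau)


# Raindrop of diameter 0.5 mm, assuming water density 1000 kg/m^3
D = 0.5e-3
mass = 1000.0 * math.pi * D ** 3 / 6.0
c1 = 1.55e-4 * D
print(mass * G / c1, mass / c1)  # terminal speed (m/s) and time constant (s)
```

The printed terminal speed and time constant can be compared with the rounded values quoted for the raindrop.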

Quadratic regime $c_2 v \gg c_1$

For larger objects at higher velocities, i.e. high Reynolds number, the drag depends on the square of the velocity, making it necessary to differentiate between objects rising and falling. The equation of motion is

$$-mg \pm c_2 v^2 = m\frac{dv}{dt}$$

where the positive sign is for falling objects and the negative sign for rising objects. Integrating the equation of motion for falling gives

$$t = \int_{v_0}^{v}\frac{m\,dv}{-mg + c_2 v^2} = \tau\left(\tanh^{-1}\frac{v_0}{v_\infty} - \tanh^{-1}\frac{v}{v_\infty}\right)$$

where $\tau = \sqrt{\frac{m}{c_2 g}}$ and $v_\infty = \sqrt{\frac{mg}{c_2}}$. That is, $\tau = \frac{v_\infty}{g}$. For the case of a falling object with $v_0 = 0$, solving for velocity gives

$$v = -v_\infty\tanh\frac{t}{\tau}$$

where the minus sign indicates downward motion, so the speed is $v_\infty\tanh\frac{t}{\tau}$.

As an example, a 0.6 kg basketball with $D = 0.25$ m will have $v_\infty = 20$ m/s (43 mph) and $\tau = 2.1$ s.

Consider President George H. W. Bush skydiving. Assume his mass is 70 kg and assume an equivalent spherical shape of the former President to have a diameter of $D = 1$ m. This gives $v_\infty = 56$ m/s (120 mph) and $\tau = 5.6$ s. When Bush senior opens his 8 m diameter parachute his terminal velocity is estimated to decrease to 7 m/s (15 mph), which is close to the value for a typical 8 m diameter emergency parachute, which has a measured terminal velocity of 11 mph in spite of air leakage through the central vent needed to provide stability.
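
The terminal speeds quoted for the quadratic regime can be checked with a few lines of Python. This is a sketch under the stated assumptions: a spherical shape, $c_2 = 0.22 D^2$ in MKS units, and $g = 9.81$ m/s^2; the function names are illustrative.

```python
import math

G = 9.81  # m/s^2, assumed value of the acceleration due to gravity


def quadratic_terminal(mass, diameter):
    """Return (terminal speed, time constant) for purely quadratic drag."""
    c2 = 0.22 * diameter ** 2
    v_inf = math.sqrt(mass * G / c2)
    tau = v_inf / G
    return v_inf, tau


def falling_speed(t, mass, diameter):
    """Speed at time t of an object released from rest, v_inf * tanh(t / tau)."""
    v_inf, tau = quadratic_terminal(mass, diameter)
    return v_inf * math.tanh(t / tau)
```

Calling `quadratic_terminal(0.6, 0.25)` corresponds to the basketball, `quadratic_terminal(70.0, 1.0)` to the skydiver, and `quadratic_terminal(70.0, 8.0)` to the open parachute.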


2.10  Example:  Projectile  motion  in  air 

Consider a projectile initially at $x = y = 0$ at $t = 0$, that is fired at an initial velocity $v_0$ at an angle $\theta$ to the horizontal. In order to understand the general features of the solution, assume that the drag is proportional to velocity. This is incorrect for typical projectile velocities, but simplifies the mathematics. The equations of motion can be expressed as

$$m\ddot{x} = -km\dot{x}$$

$$m\ddot{y} = -km\dot{y} - mg$$

where $k$ is the coefficient for air drag. Take the initial conditions at $t = 0$ to be $x = y = 0$, $\dot{x} = v_0\cos\theta$, $\dot{y} = v_0\sin\theta$.

Solving in the $x$ coordinate,

$$\frac{d\dot{x}}{dt} = -k\dot{x}$$

Therefore

$$\dot{x} = v_0\cos\theta\,e^{-kt}$$

That is, the velocity decays to zero with a time constant $\tau = \frac{1}{k}$. Integration of the velocity equation gives

$$x = \frac{v_0\cos\theta}{k}\left(1 - e^{-kt}\right)$$

Note that this implies that the body approaches a value of $x = \frac{v_0\cos\theta}{k}$ as $t \to \infty$.

The  trajectory  of  an  object  is  distorted  from  the  parabolic  shape,  that  occurs  for  k = 0,  due  to  the  rapid 
drop  in  range  as  the  drag  coefficient  increases.  For  realistic  cases  it  is  necessary  to  use  a computer  to  solve 
this  numerically. 
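
As a minimal illustration of such a numerical solution, the Python sketch below integrates the horizontal equation of motion for linear drag with simple Euler steps and compares the result with the analytic expression for $x(t)$. The step size and the parameter values used in any call are arbitrary assumptions; a realistic calculation would use the quadratic drag law and a higher-order integrator.

```python
import math


def simulate_x(v0, theta, k, t_end, dt=1e-4):
    """Integrate x'' = -k x' with explicit Euler steps and return x(t_end)."""
    x = 0.0
    vx = v0 * math.cos(theta)
    for _ in range(int(round(t_end / dt))):
        x += vx * dt           # advance the position with the current velocity
        vx += -k * vx * dt     # linear drag decelerates the horizontal motion
    return x


def analytic_x(v0, theta, k, t):
    """Closed-form x(t) = (v0 cos(theta) / k) (1 - exp(-k t))."""
    return v0 * math.cos(theta) / k * (1.0 - math.exp(-k * t))
```

For instance, `simulate_x(30.0, math.pi / 4, 0.5, 4.0)` and `analytic_x(30.0, math.pi / 4, 0.5, 4.0)` agree closely, and both stay below the limiting value $v_0\cos\theta/k$.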

2.12.6  Systems  with  Variable  Mass 

Classic  examples  of  systems  with  variable  mass  are  the  rocket,  nuclear  fission  and  other  modes  of  nuclear 
decay. 

Consider the problem of rocket motion in a gravitational field. When there is a vertical gravitational external field the vertical momentum is not conserved due to both gravity and the ejection of rocket propellant. In a time $dt$ the rocket ejects propellant $dm_p$ with exhaust velocity relative to the rocket of $u$. Thus the momentum imparted to this propellant is

$$dp_p = -u\,dm_p \qquad (2.110)$$

Therefore the rocket is given an equal and opposite increase in momentum $dp_R$

$$dp_R = +u\,dm_p \qquad (2.111)$$

In the time interval $dt$ the net change in the linear momentum of the rocket plus fuel system is given by

$$dp = (m - dm_p)(v + dv) + dm_p(v - u) - mv = m\,dv - u\,dm_p \qquad (2.112)$$

where the second-order term $dm_p\,dv$ has been neglected. The rate of change of the linear momentum thus equals

$$F_{ex} = \frac{dp}{dt} = m\frac{dv}{dt} - u\frac{dm_p}{dt}$$

Consider the problem for the special case of vertical ascent of the rocket against the external gravitational force $F_{ex} = -mg$. Then

$$-mg + u\frac{dm_p}{dt} = m\frac{dv}{dt} \qquad (2.113)$$

This can be rewritten as

$$-mg + u\dot{m}_p = m\dot{v} \qquad (2.114)$$

The second term comes from the variable mass. But the loss of mass of the rocket equals the mass of the ejected propellant. Assuming a constant fuel burn $\dot{m}_p = \alpha$ then

$$\dot{m} = -\dot{m}_p = -\alpha \qquad (2.115)$$

where $\alpha > 0$. Then the equation becomes

$$dv = \left(-g + \frac{\alpha}{m}u\right)dt \qquad (2.116)$$

Since

$$\frac{dm}{dt} = -\alpha \qquad (2.117)$$

then

$$-\frac{dm}{\alpha} = dt \qquad (2.118)$$

Inserting this in the above equation gives

$$dv = \left(\frac{g}{\alpha} - \frac{u}{m}\right)dm \qquad (2.119)$$

Figure 2.5: Vertical motion of a rocket in a gravitational field

Integration, for a rocket starting from rest with initial mass $m_0$, gives

$$v = -\frac{g}{\alpha}(m_0 - m) + u\ln\left(\frac{m_0}{m}\right) \qquad (2.120)$$

But the change in mass is given by

$$\int_{m_0}^{m}dm = -\alpha\int_0^t dt \qquad (2.121)$$

That is

$$m_0 - m = \alpha t \qquad (2.122)$$

Thus

$$v = -gt + u\ln\left(\frac{m_0}{m}\right) \qquad (2.123)$$
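
Equation 2.123 can be checked against a direct numerical integration of equation 2.116. The Python sketch below assumes constant $g = 9.81$ m/s^2, a constant burn rate $\alpha$, and a rocket starting from rest; the function names and any parameter values passed to them are illustrative assumptions.

```python
import math

G = 9.81  # m/s^2, assumed constant acceleration due to gravity


def rocket_velocity(t, m0, alpha, u):
    """Closed-form velocity v = -g t + u ln(m0 / m) with m = m0 - alpha t."""
    m = m0 - alpha * t
    if m <= 0.0:
        raise ValueError("burn time exceeds the available mass")
    return -G * t + u * math.log(m0 / m)


def rocket_velocity_numeric(t_end, m0, alpha, u, steps=20000):
    """Midpoint-rule integration of dv = (-g + alpha u / m) dt from rest."""
    dt = t_end / steps
    v = 0.0
    for i in range(steps):
        m = m0 - alpha * (i + 0.5) * dt   # mass at the middle of the step
        v += (-G + alpha * u / m) * dt
    return v
```

For a hypothetical rocket with `m0 = 1000.0`, `alpha = 10.0`, `u = 2500.0`, the two functions agree at `t = 60.0` to well within the integration error.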


Note that once the propellant is exhausted the rocket will continue to fly upwards as it decelerates in the gravitational field. You can easily calculate the maximum height. Note that this formula assumes that the acceleration due to gravity is constant whereas for large heights above the Earth it is necessary to use the true gravitational force $-G\frac{Mm}{r^2}$, where $M$ is the mass of the Earth and $r$ is the distance from the center of the Earth. In real situations it is necessary to include air drag, which requires use of a computer to numerically solve the equations of motion. The highest rocket velocity is attained by maximizing the exhaust velocity and the ratio of initial to final mass. Because the terminal velocity is limited by the mass ratio, engineers construct multistage rockets that jettison the spent fuel containers and rockets.

2.12.7 Rigid-body rotation about a body-fixed rotation axis

The most general case of rigid-body rotation involves rotation about some body-fixed point with the orientation of the rotation axis undefined. For example, an object spinning in space will rotate about the center of mass with the rotation axis having any orientation. Another example is a child’s spinning top which spins with arbitrary orientation of the axis of rotation about the pointed end, which touches the ground at a static location. Such rotation about a body-fixed point is complicated and will be discussed in chapter 11. Rigid-body rotation is easier to handle if the orientation of the axis of rotation is fixed with respect to the rigid body. An example of such motion is a hinged door.

For a rigid body rotating with angular velocity $\boldsymbol{\omega}$, the total angular momentum $\mathbf{L}$ is given by

$$\mathbf{L} = \sum_i^n \mathbf{L}_i = \sum_i^n \mathbf{r}_i\times\mathbf{p}_i \qquad (2.124)$$

For rotation, appendix equation D.29 gives

$$\mathbf{v}_i = \boldsymbol{\omega}\times\mathbf{r}_i \qquad (2.94)$$

thus the angular momentum can be written as

$$\mathbf{L} = \sum_i^n \mathbf{r}_i\times\mathbf{p}_i = \sum_i^n m_i\mathbf{r}_i\times\left(\boldsymbol{\omega}\times\mathbf{r}_i\right) \qquad (2.125)$$

This can be simplified using the vector identity equation B.24 giving

$$\mathbf{L} = \sum_i^n\left[\left(m_i r_i^2\right)\boldsymbol{\omega} - \left(\mathbf{r}_i\cdot\boldsymbol{\omega}\right)m_i\mathbf{r}_i\right] \qquad (2.126)$$

Rigid-body rotation about a body-fixed symmetry axis

The simplest case for rigid-body rotation is when the body has a symmetry axis with the angular velocity $\boldsymbol{\omega}$ parallel to this body-fixed symmetry axis. For this case $\mathbf{r}_i$ can be taken perpendicular to $\boldsymbol{\omega}$, for which the second term in equation 2.126 vanishes, i.e. $\left(\mathbf{r}_i\cdot\boldsymbol{\omega}\right) = 0$, thus

$$\mathbf{L}_{sym} = \sum_i^n\left(m_i r_i^2\right)\boldsymbol{\omega} \qquad (\mathbf{r}_i \text{ perpendicular to } \boldsymbol{\omega})$$

The moment of inertia about the symmetry axis is defined as

$$I_{sym} = \sum_i^n m_i r_i^2 \qquad (2.127)$$

where $r_i$ is the perpendicular distance from the axis of rotation to the body element $m_i$. For a continuous body the moment of inertia can be generalized to an integral over the mass density $\rho$ of the body

$$I_{sym} = \int\rho r^2\,dV \qquad (2.128)$$

where $r$ is perpendicular to the rotation axis. The definition of the moment of inertia allows rewriting the angular momentum about a symmetry axis $\mathbf{L}_{sym}$ in the form

$$\mathbf{L}_{sym} = I_{sym}\boldsymbol{\omega} \qquad (2.129)$$

where the moment of inertia $I_{sym}$ is taken about the symmetry axis, assuming that the angular velocity vector is parallel to the symmetry axis.

Rigid-body rotation about a non-symmetric body-fixed axis

In general the fixed axis of rotation is not aligned with a symmetry axis of the body, or the body does not have a symmetry axis, both of which complicate the problem.

For illustration consider that the rigid body comprises a system of $n$ masses $m_i$ located at positions $\mathbf{r}_i$, with the rigid body rotating about the $z$ axis with angular velocity $\boldsymbol{\omega}$. That is,

$$\boldsymbol{\omega} = \omega_z\hat{\mathbf{z}} \qquad (2.130)$$

In cartesian coordinates the fixed-frame vector for particle $i$ is

$$\mathbf{r}_i = (x_i, y_i, z_i) \qquad (2.131)$$

Using these in the cross product (2.94) gives

$$\mathbf{v}_i = \boldsymbol{\omega}\times\mathbf{r}_i = \begin{pmatrix} -\omega_z y_i \\ \omega_z x_i \\ 0 \end{pmatrix} \qquad (2.132)$$

which is written as a column vector for clarity. Inserting $\mathbf{v}_i$ in the cross-product $\mathbf{r}_i\times\mathbf{v}_i$ gives the components of the angular momentum to be

$$\mathbf{L} = \sum_i^n m_i\mathbf{r}_i\times\mathbf{v}_i = \sum_i^n m_i\omega_z\begin{pmatrix} -z_i x_i \\ -z_i y_i \\ x_i^2 + y_i^2 \end{pmatrix}$$

That is, the components of the angular momentum are

$$L_x = -\left(\sum_i^n m_i z_i x_i\right)\omega_z \equiv I_{xz}\omega_z \qquad (2.133)$$

$$L_y = -\left(\sum_i^n m_i z_i y_i\right)\omega_z \equiv I_{yz}\omega_z$$

$$L_z = \left(\sum_i^n m_i\left[x_i^2 + y_i^2\right]\right)\omega_z \equiv I_{zz}\omega_z$$

Note that the perpendicular distance from the $z$ axis in cylindrical coordinates is $\rho_i = \sqrt{x_i^2 + y_i^2}$, thus the angular momentum $L_z$ about the $z$ axis can be written as

$$L_z = \left(\sum_i^n m_i\rho_i^2\right)\omega_z = I_{zz}\omega_z \qquad (2.134)$$

where (2.134) gives the elementary formula for the moment of inertia $I_{zz} = I_{sym}$ about the $z$ axis given earlier in (2.129).

The surprising result is that $L_x$ and $L_y$ are non-zero, implying that the total angular momentum vector $\mathbf{L}$ is in general not parallel with $\boldsymbol{\omega}$. This can be understood by considering the single body $m$ shown in figure 2.6. When the body is in the $y,z$ plane then $x = 0$ and $L_x = 0$. Thus the angular momentum vector $\mathbf{L}$ has a component along the $-y$ direction as shown, which is not parallel with $\boldsymbol{\omega}$ and, since the vectors $\boldsymbol{\omega}, \mathbf{L}, \mathbf{r}_i$ are coplanar, then $\mathbf{L}$ must sweep around the rotation axis $\boldsymbol{\omega}$ to remain coplanar with the body as it rotates about the $z$ axis. Instantaneously the velocity of the body $\mathbf{v}_i$ is into the plane of the paper and, since $\mathbf{L}_i = m_i\mathbf{r}_i\times\mathbf{v}_i$, then $\mathbf{L}_i$ is at an angle $(90^\circ - \alpha)$ to the $z$ axis. This implies that a torque must be applied to rotate the angular momentum vector. This explains why your automobile shakes if the rotation axis and symmetry axis are not parallel for one wheel.

Figure 2.6: A rigid rotating body comprising a single mass $m$ attached by a massless rod at a fixed angle $\alpha$, shown at the instant when $m$ happens to lie in the $yz$ plane. As the body rotates about the $z$ axis the mass $m$ has a velocity and momentum into the page (the negative $x$ direction). Therefore the angular momentum $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ is in the direction shown, which is not parallel to the angular velocity $\boldsymbol{\omega}$.

The first two moments in (2.133) are called products of inertia of the body, designated by the pair of axes involved. Therefore, to avoid confusion, it is necessary to define the diagonal moment, which is called the moment of inertia, by two subscripts as $I_{zz}$. Thus in general, a body can have three moments of inertia about the three axes plus three products of inertia. This group of moments comprises the inertia tensor which will be discussed further in chapter 11. If a body has an axis of symmetry along the $z$ axis then the summations will give $I_{xz} = I_{yz} = 0$ while $I_{zz}$ will be unchanged. That is, for rotation about a symmetry axis the angular momentum and rotation axes are parallel. Any axis along which the angular momentum and angular velocity coincide is called a principal axis of the body.
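
The appearance of components of $\mathbf{L}$ perpendicular to the rotation axis can be seen by evaluating $\mathbf{L} = \sum_i m_i\mathbf{r}_i\times\mathbf{v}_i$ directly for a few point masses rotating about the $z$ axis. The Python sketch below is illustrative only; the function name and the sample masses are assumptions.

```python
def angular_momentum_about_z(masses, positions, omega_z):
    """Return (Lx, Ly, Lz) for point masses rotating about the z axis."""
    Lx = Ly = Lz = 0.0
    for m, (x, y, z) in zip(masses, positions):
        # velocity of each mass, v = omega x r with omega = (0, 0, omega_z)
        vx, vy, vz = -omega_z * y, omega_z * x, 0.0
        # accumulate m * (r x v) component by component
        Lx += m * (y * vz - z * vy)
        Ly += m * (z * vx - x * vz)
        Lz += m * (x * vy - y * vx)
    return Lx, Ly, Lz
```

A single mass at `(0.0, 1.0, 1.0)` gives a non-zero `Ly`, whereas adding an equal mass at `(0.0, -1.0, 1.0)` restores the symmetry about the $z$ axis and `Ly` cancels, leaving $\mathbf{L}$ parallel to the rotation axis.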

2.11  Example:  Moment  of  inertia  of  a thin  door 

Consider that the door has width $a$ and height $b$ and assume the door thickness is negligible with areal density $\sigma$ kg/m$^2$. Assume that the door is hinged about the $y$ axis. The mass of a surface element of dimension $dx\,dy$ at a distance $x$ from the rotation axis is $dm = \sigma\,dx\,dy$, thus the mass of the complete door is $M = \sigma ab$. The moment of inertia about the $y$ axis is given by

$$I = \int_{x=0}^{a}\int_{y=0}^{b}\sigma x^2\,dy\,dx = \frac{1}{3}\sigma b a^3 = \frac{1}{3}Ma^2$$
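
The result $I = \frac{1}{3}Ma^2$ can be confirmed by summing the contributions of thin vertical strips of the door. The following Python sketch uses a midpoint sum; the door dimensions used in any call are arbitrary assumptions.

```python
def door_moment_of_inertia(width, height, sigma, n=1000):
    """Midpoint-rule sum of x**2 dm over vertical strips of a thin door."""
    dx = width / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx                 # distance of the strip from the hinge
        dm = sigma * height * dx           # mass of the strip
        total += x ** 2 * dm
    return total
```

For a door with `width = 0.9`, `height = 2.0`, and `sigma = 10.0` the mass is 18 kg, and the sum reproduces $\frac{1}{3}Ma^2$ to within the discretization error.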


2.12  Example:  Merry-go-round 

A child of mass $m$ jumps onto the outside edge of a circular merry-go-round of moment of inertia $I$, radius $R$ and initial angular velocity $\omega_0$. What is the final angular velocity $\omega_f$?

If the initial angular momentum is $L_0$ and, assuming the child jumps with zero angular velocity, then the conservation of angular momentum implies that

$$L_0 = L_f$$

$$I\omega_0 = I\omega_f + m v_f R$$

$$I\omega_0 = \left(I + mR^2\right)\omega_f$$

where $v_f = \omega_f R$ is the final speed of the child. That is

$$\frac{\omega_f}{\omega_0} = \frac{I}{I + mR^2}$$

Note  that  this  is  true  independent  of  the  details  of  the  acceleration  of  the  initially  stationary  child. 
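
A one-line numerical version of this result is sketched below in Python; the function name and the sample values are illustrative assumptions.

```python
def final_angular_velocity(I, m, R, omega0):
    """Final angular velocity after a child of mass m lands on the rim at radius R."""
    return I * omega0 / (I + m * R ** 2)
```

For example, with `I = 500.0` kg m^2, `m = 30.0` kg, and `R = 2.0` m, the total moment of inertia rises from 500 to 620 kg m^2, so the angular velocity drops by the factor 500/620.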

2.13 Example: Cue pushes a billiard ball

Consider a billiard ball of mass $M$ and radius $R$ that is pushed by a cue in a direction that passes through the center of gravity such that the ball attains a velocity $v_0$. The friction coefficient between the table and the ball is $\mu$. How far does the ball move before the initial slipping motion changes to pure rolling motion?

Figure: Cue pushing a billiard ball horizontally at the height of the centre of rotation of the ball.

Since the direction of the cue force passes through the center of mass of the ball, it contributes zero torque to the ball. Thus the initial angular momentum is zero at $t = 0$. The friction force $f$ points opposite to the direction of motion and causes a torque $N_s$ about the center of mass in the direction $s$.

$$N_s = f\cdot R = \mu MgR$$

Since the moment of inertia about the center of a uniform sphere is $I = \frac{2}{5}MR^2$ then the angular acceleration of the ball is

$$\dot{\omega} = \frac{\mu MgR}{I} = \frac{\mu MgR}{\frac{2}{5}MR^2} = \frac{5}{2}\frac{\mu g}{R} \qquad (\alpha)$$

Moreover the frictional force causes a deceleration $a_s$ of the linear velocity of the center of mass of

$$a_s = -\frac{f}{M} = -\mu g \qquad (\beta)$$

Integrating $(\alpha)$ from time zero to $t$ gives

$$\omega = \int_0^t\dot{\omega}\,dt = \frac{5}{2}\frac{\mu g}{R}t$$

The linear velocity of the center of mass at time $t$ is given by integration of equation $(\beta)$

$$v_s = v_0 + \int_0^t a_s\,dt = v_0 - \mu gt$$

The billiard ball stops sliding and only rolls when $v_s = \omega R$, that is, when

$$\frac{5}{2}\frac{\mu g}{R}t_{roll}R = v_0 - \mu g t_{roll}$$

That is, when

$$t_{roll} = \frac{2}{7}\frac{v_0}{\mu g}$$

Thus the ball slips for a distance

$$s = \int_0^{t_{roll}}v_s\,dt = v_0 t_{roll} - \frac{\mu g\,t_{roll}^2}{2} = \frac{12}{49}\frac{v_0^2}{\mu g}$$

Note that if the ball is pushed at a distance $h$ above the center of mass, besides the linear velocity there is an initial angular velocity of

$$\omega = \frac{Mv_0 h}{\frac{2}{5}MR^2} = \frac{5}{2}\frac{v_0 h}{R^2}$$

For the case $h = \frac{2}{5}R$ then the ball immediately assumes a pure non-slipping roll. For $h < \frac{2}{5}R$ one has $\omega < \frac{v_0}{R}$ while $h > \frac{2}{5}R$ corresponds to $\omega > \frac{v_0}{R}$. In the latter case the frictional force points forward.
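
The slip time and slip distance for the centrally struck ball can be checked by stepping the two equations of motion forward in time until the rolling condition $v_s = \omega R$ is met. The Python sketch below uses simple Euler steps and assumes a uniform sphere and $g = 9.81$ m/s^2; the function name and the sample values are illustrative.

```python
def slip_to_roll(v0, mu, R, g=9.81, dt=1e-5):
    """Return (time, distance) at which a sliding uniform sphere starts to roll."""
    t = 0.0
    s = 0.0
    v = v0          # linear velocity of the center of mass
    omega = 0.0     # angular velocity, zero for a central push
    while v > omega * R:
        s += v * dt
        v -= mu * g * dt                   # friction decelerates the translation
        omega += 2.5 * mu * g / R * dt     # friction torque spins the ball up
        t += dt
    return t, s
```

With the hypothetical values `v0 = 2.0`, `mu = 0.2`, `R = 0.028`, the returned pair agrees with $\frac{2}{7}\frac{v_0}{\mu g}$ and $\frac{12}{49}\frac{v_0^2}{\mu g}$ to within the step size.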


2.12.8  Time  dependent  forces 

Many problems involve action in the presence of a time dependent force. There are two extreme cases that are often encountered. One is an impulsive force that acts for a very short time, for example, striking a ball with a bat, or the collision of two cars. The second is an oscillatory time dependent force. The response to impulsive forces is discussed below whereas the response to oscillatory time dependent forces is discussed in chapter 3.


Translational  impulsive  forces 

An  impulsive  force  acts  for  a very  short  time  relative  to  the  response  time  of  the  mechanical  system  being 
discussed.  In  principle  the  equation  of  motion  can  be  solved  if  the  complicated  time  dependence  of  the  force, 
F(t),  is  known.  However,  often  it  is  possible  to  use  the  much  simpler  approach  employing  the  concept  of  an 
impulse  and  the  principle  of  the  conservation  of  linear  momentum. 

Define the linear impulse $\mathbf{P}$ to be the first-order time integral of the time-dependent force.

$$\mathbf{P} \equiv \int_0^t\mathbf{F}(t')\,dt' \qquad (2.135)$$

Since $\mathbf{F}(t) = \frac{d\mathbf{p}}{dt}$ then equation 2.135 gives that

$$\mathbf{P} = \int_0^t\frac{d\mathbf{p}}{dt'}\,dt' = \int_0^t d\mathbf{p} = \mathbf{p}(t) - \mathbf{p}_0 = \Delta\mathbf{p} \qquad (2.136)$$

Thus the impulse $\mathbf{P}$ is an unambiguous quantity that equals the change in linear momentum of the object that has been struck, which is independent of the details of the time dependence of the impulsive force. Computation of the spatial motion still requires knowledge of $\mathbf{F}(t)$ since equation 2.136 can be written as

$$\mathbf{v}(t) = \frac{1}{m}\int_0^t\mathbf{F}(t')\,dt' + \mathbf{v}_0 \qquad (2.137)$$

Integration gives

$$\mathbf{r}(t) - \mathbf{r}_0 = \mathbf{v}_0 t + \frac{1}{m}\int_0^t\left[\int_0^{t''}\mathbf{F}(t')\,dt'\right]dt'' \qquad (2.138)$$

In general this is complicated. However, for the case of a constant force $\mathbf{F}(t) = \mathbf{F}_0$, this simplifies to the constant acceleration equation

$$\mathbf{r}(t) - \mathbf{r}_0 = \mathbf{v}_0 t + \frac{1}{2}\frac{\mathbf{F}_0}{m}t^2 \qquad (2.139)$$

where the constant acceleration $\mathbf{a} = \frac{\mathbf{F}_0}{m}$.
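
The statement that the impulse fixes the change in momentum, whatever the detailed shape of $F(t)$, can be illustrated numerically. The Python sketch below integrates an assumed half-sine force pulse with the trapezoidal rule; the pulse shape, its peak value, and its duration are hypothetical choices and are not taken from the text.

```python
import math


def impulse(force, t_end, steps=10000):
    """Trapezoidal-rule time integral of force(t) from 0 to t_end."""
    dt = t_end / steps
    total = 0.5 * (force(0.0) + force(t_end))
    for i in range(1, steps):
        total += force(i * dt)
    return total * dt


def half_sine_pulse(peak, duration):
    """Return F(t) = peak * sin(pi t / duration), an assumed pulse shape."""
    return lambda t: peak * math.sin(math.pi * t / duration)
```

For a pulse with `peak = 5000.0` N and `duration = 1e-3` s, `impulse(half_sine_pulse(5000.0, 1e-3), 1e-3)` approximates the exact value $2F_{peak}T/\pi$, and dividing by the mass of the struck object gives its change in velocity.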


Angular  impulsive  torques 

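Note that the principle of impulse also applies to angular motion. Define an impulsive torque $\mathbf{T}$ as the first-order time integral of the time-dependent torque.

$$\mathbf{T} \equiv \int_0^t\mathbf{N}(t')\,dt' \qquad (2.140)$$

Since torque is related to the rate of change of angular momentum

$$\mathbf{N}(t) = \frac{d\mathbf{L}}{dt} \qquad (2.141)$$

then

$$\mathbf{T} = \int_0^t\frac{d\mathbf{L}}{dt'}\,dt' = \int_0^t d\mathbf{L} = \mathbf{L}(t) - \mathbf{L}_0 = \Delta\mathbf{L} \qquad (2.142)$$

Thus the impulsive torque $\mathbf{T}$ equals the change in angular momentum $\Delta\mathbf{L}$ of the struck body.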


2.14  Example:  Center  of  percussion  of  a baseball  bat 

When an impulsive force $\mathbf{P}$ strikes a bat of mass $M$ at a distance $\mathbf{s}$ from the center of mass, then both the linear momentum of the center of mass, and the angular momentum about the center of mass, of the bat are changed. Assume that the ball strikes the bat with an impulsive force $\mathbf{P} = \Delta\mathbf{p}^{ball}$ perpendicular to the symmetry axis of the bat at the strike point $S$ which is a distance $s$ from the center of mass of the bat. The translational impulse given to the bat equals the change in linear momentum of the ball as given by equation 2.136 coupled with the conservation of linear momentum

$$\mathbf{P} = \Delta\mathbf{p}^{bat}_{cm} = M\Delta\mathbf{v}^{bat}_{cm}$$

Similarly equation 2.142 gives that the angular impulse $\mathbf{T}$ equals the change in angular momentum about the center of mass to be

$$\mathbf{T} = \mathbf{s}\times\mathbf{P} = \Delta\mathbf{L}_{cm} = I_{cm}\Delta\boldsymbol{\omega}_{cm}$$

The above equations give that

$$\Delta\mathbf{v}^{bat}_{cm} = \frac{\mathbf{P}}{M}$$

$$\Delta\boldsymbol{\omega}^{bat}_{cm} = \frac{\mathbf{s}\times\mathbf{P}}{I_{cm}}$$

Assume that the bat was stationary prior to the strike, then after the strike the net translational velocity of a point $O$ along the body-fixed symmetry axis of the bat at a distance $y$ from the center of mass, is given by

$$\mathbf{v}(\mathbf{y}) = \Delta\mathbf{v}_{cm} + \Delta\boldsymbol{\omega}_{cm}\times\mathbf{y} = \frac{\mathbf{P}}{M} + \frac{1}{I_{cm}}\left((\mathbf{s}\times\mathbf{P})\times\mathbf{y}\right) = \frac{\mathbf{P}}{M} + \frac{1}{I_{cm}}\left[(\mathbf{s}\cdot\mathbf{y})\,\mathbf{P} - (\mathbf{P}\cdot\mathbf{y})\,\mathbf{s}\right]$$

It is assumed that $\mathbf{P}$ is perpendicular to the symmetry axis, along which both $\mathbf{s}$ and $\mathbf{y}$ lie, and thus $(\mathbf{P}\cdot\mathbf{y}) = 0$, which simplifies the above equation to

$$\mathbf{v}(\mathbf{y}) = \Delta\mathbf{v}_{cm} + \Delta\boldsymbol{\omega}_{cm}\times\mathbf{y} = \frac{\mathbf{P}}{M}\left(1 + \frac{M}{I_{cm}}(\mathbf{s}\cdot\mathbf{y})\right)$$

Note that the translational velocity of the location $O$, along the bat symmetry axis at a distance $y$ from the center of mass, is zero if the bracket equals zero, that is, if

$$\mathbf{s}\cdot\mathbf{y} = -\frac{I_{cm}}{M} \equiv -k_{cm}^2$$

where $k_{cm}$ is called the radius of gyration of the body about the center of mass. Note that when the scalar product $\mathbf{s}\cdot\mathbf{y} = -\frac{I_{cm}}{M} = -k_{cm}^2$ then there will be no translational motion at the point $O$. This point on the $y$ axis lies on the opposite side of the center of mass from the strike point $S$, and is called the center of percussion corresponding to the impulse at the point $S$. The center of percussion often is referred to as the "sweet spot" for an object corresponding to the impulse at the point $S$. For a baseball bat the batter holds the bat at the center of percussion so that they do not feel an impulse in their hands when the ball is struck at the point $S$. This principle is used extensively to design bats for all sports involving striking a ball with a bat, such as cricket, squash, tennis, etc., as well as weapons such as swords and axes used to decapitate opponents.
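
The condition $\mathbf{s}\cdot\mathbf{y} = -I_{cm}/M$ is easy to evaluate for a simple model. The Python sketch below treats positions along the symmetry axis as signed scalars measured from the center of mass; modelling the bat as a uniform rod of length $L$, for which $I_{cm} = \frac{1}{12}ML^2$, is an illustrative assumption and not a description of a real bat.

```python
def center_of_percussion(I_cm, M, s):
    """Signed position y on the axis (from the center of mass) with s * y = -I_cm / M."""
    return -I_cm / (M * s)


def point_velocity(P, M, I_cm, s, y):
    """Speed of the axis point at y just after an impulse P applied at s."""
    return P / M * (1.0 + M * s * y / I_cm)
```

For a uniform rod with `M = 1.0` and `L = 1.0`, so that `I_cm = 1.0 / 12.0`, a strike at `s = 0.25` gives a center of percussion at `y = -1/3`, on the opposite side of the center of mass, and `point_velocity` vanishes there.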


2.15  Example:  Energy  transfer  in  charged-particle  scattering 


Consider a particle of charge $+e_1$ moving with very high velocity $v_0$ along a straight line that passes a distance $b$ from another charge $+e_2$ and mass $m$. Find the energy $Q$ transferred to the mass $m$ during the encounter assuming the force is given by Coulomb's law. Since the charged particle $e_1$ moves at very high speed it is assumed that charge 2 does not change position during the encounter. Assume that charge 1 moves along the $-y$ axis through the origin while charge 2 is located on the $x$ axis at $x = b$. Let us consider the impulse given to charge 2 during the encounter. By symmetry the $y$ component must cancel while the $x$ component is given by

$$dp_x = F_x\,dt = -\frac{e_1 e_2}{4\pi\epsilon_0 r^2}\cos\theta\,dt = -\frac{e_1 e_2}{4\pi\epsilon_0 r^2}\cos\theta\,\frac{dt}{d\theta}\,d\theta$$

But

$$r\dot{\theta} = -v_0\cos\theta$$

where

$$\frac{b}{r} = \cos(\pi - \theta) = -\cos\theta$$

Thus

$$dp_x = -\frac{e_1 e_2}{4\pi\epsilon_0 b v_0}\cos\theta\,d\theta$$

Integrating over $\frac{\pi}{2} < \theta < \frac{3\pi}{2}$ gives that the total momentum imparted to $e_2$ is

$$p_x = -\frac{e_1 e_2}{4\pi\epsilon_0 b v_0}\int_{\frac{\pi}{2}}^{\frac{3\pi}{2}}\cos\theta\,d\theta = \frac{e_1 e_2}{2\pi\epsilon_0 b v_0}$$

Thus the recoil energy of charge 2 is given by

$$E_2 = \frac{p_x^2}{2m} = \frac{1}{2m}\left(\frac{e_1 e_2}{2\pi\epsilon_0 b v_0}\right)^2$$
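
The momentum transfer can be checked by integrating the $x$ component of the Coulomb force along the straight-line path of charge 1 directly in time, with charge 2 held fixed as assumed above. The Python sketch below does this with a midpoint rule over a finite but long segment of the path; the function names, the truncation length, and any sample charges or speeds are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
E_CHARGE = 1.602176634e-19   # elementary charge, C


def impulse_analytic(e1, e2, b, v0):
    """Momentum transfer e1 e2 / (2 pi eps0 b v0) derived in the example."""
    return e1 * e2 / (2.0 * math.pi * EPS0 * b * v0)


def impulse_numeric(e1, e2, b, v0, span=200.0, steps=200000):
    """Midpoint-rule integral of F_x dt with charge 1 at y = v0 t, for |y| < span * b."""
    k = e1 * e2 / (4.0 * math.pi * EPS0)
    dy = 2.0 * span * b / steps
    total = 0.0
    for i in range(steps):
        y = -span * b + (i + 0.5) * dy
        # x component of the Coulomb force on charge 2, times dt = dy / v0
        total += k * b / (b * b + y * y) ** 1.5 * dy / v0
    return total


def recoil_energy(e1, e2, b, v0, m):
    """Recoil energy p_x**2 / (2 m) of charge 2."""
    return impulse_analytic(e1, e2, b, v0) ** 2 / (2.0 * m)
```

For a hypothetical encounter with `e1 = 2 * E_CHARGE`, `e2 = E_CHARGE`, `b = 1e-10` m, and `v0 = 1e7` m/s, the two impulse functions agree to within the small error caused by truncating the path.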


2.13  Solution  of  many-body  equations  of  motion 

The  following  are  general  methods  used  to  solve  Newton’s  many-body  equations  of  motion  for  practical 
problems. 

2.13.1  Analytic  solution 

In practical problems one has to solve a set of equations of motion since the forces depend on the location of every body involved. For example one may be dealing with a set of coupled oscillators such as the many components that comprise the suspension system of an automobile. Often the coupled equations of motion comprise a set of coupled second-order differential equations. The first approach to solve such a system is to try an analytic solution comprising a general solution of the homogeneous equation plus one particular solution of the inhomogeneous equation. Another approach is to employ numerical integration using a computer.


2.13.2  Successive  approximation 

When  the  system  of  coupled  differential  equations  of  motion  is  too  complicated  to  solve  analytically  one 
can  use  the  method  of  successive  approximation.  The  differential  equations  are  transformed  to  integral 
equations.  Then  one  starts  with  some  initial  conditions  to  make  a first  order  estimate  of  the  functions.  The 
functions  determined  by  this  first  order  estimate  then  are  used  in  a second  iteration  and  this  is  repeated 
until the solution converges. An example of this approach is when making Hartree-Fock calculations of the
electron  distributions  in  an  atom.  The  first  order  calculation  uses  the  electron  distributions  predicted  by 
the  one-electron  model  of  the  atom.  This  result  then  is  used  to  compute  the  influence  of  the  electron  charge 
distribution  around  the  nucleus  on  the  charge  distribution  of  the  atom  for  a second  iteration  etc. 
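
A simple mechanical illustration of successive approximation is Picard iteration applied to vertical fall with linear drag, $\dot{v} = -g - kv$. The equation is rewritten as the integral equation $v(t) = v_0 + \int_0^t\left(-g - kv(t')\right)dt'$, a first estimate $v = v_0$ is inserted on the right-hand side, and the result is fed back until it stops changing. The Python sketch below is one possible realization on a time grid; the grid size, the iteration count, and the parameter values are assumptions chosen for illustration.

```python
import math


def picard_iterate(k, g, v0, t_end, n_grid=2001, n_iter=30):
    """Successive approximation of v(t) for dv/dt = -g - k v on a uniform grid."""
    dt = t_end / (n_grid - 1)
    v = [v0] * n_grid                  # first estimate: constant velocity
    for _ in range(n_iter):
        new = [v0]
        integral = 0.0
        for i in range(1, n_grid):
            f_prev = -g - k * v[i - 1]
            f_curr = -g - k * v[i]
            integral += 0.5 * (f_prev + f_curr) * dt   # trapezoidal rule
            new.append(v0 + integral)
        v = new                        # feed the estimate into the next pass
    return v


def exact_velocity(k, g, v0, t):
    """Closed-form solution used only to check convergence."""
    return -g / k + (v0 + g / k) * math.exp(-k * t)
```

With `k = 1.0`, `g = 9.81`, `v0 = 0.0`, and `t_end = 2.0`, the last element of the list returned by `picard_iterate` agrees closely with `exact_velocity(1.0, 9.81, 0.0, 2.0)`.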

2.13.3  Perturbation  method 

The perturbation technique can be applied if the force separates into two parts $F = F_1 + F_2$ where $F_1 \gg F_2$ and the solution is known for the dominant $F_1$ part of the force. Then the correction to this solution due to addition of the perturbation $F_2$ usually is easier to evaluate. As an example, consider that one of the Space Shuttle thrusters fires. In principle one has all the gravitational forces acting plus the thrust force of the thruster. The perturbation approach is to assume that the trajectory of the Space Shuttle in the Earth’s gravitational field is known. Then the perturbation to this motion due to the very small thrust, produced by the thruster, is evaluated as a small correction to the motion in the Earth’s gravitational field. This perturbation technique is used extensively in physics, especially in quantum physics. An example from my own research is scattering of a 1 GeV $^{208}$Pb ion in the Coulomb field of a $^{197}$Au nucleus. The trajectory for elastic scattering is simple to calculate since neither nucleus is excited and the total energy and momenta are conserved. However, usually one of these nuclei will be internally excited by the electromagnetic interaction. This is called Coulomb excitation. The effect of the Coulomb excitation usually can be treated as a perturbation by assuming that the trajectory is given by the elastic scattering solution and then calculating the excitation probability assuming the Coulomb excitation of the nucleus is a small perturbation to the trajectory.

2.14 Newton’s Law of Gravitation

Gravitation plays a fundamental role in classical mechanics as well as being an important example of a conservative central force. Although you may not be familiar with the following presentation addressing the gravitational field $\mathbf{g}$, it is assumed that you have met the identical discussion when addressing the electric field $\mathbf{E}$ in electrostatics. The only difference is that mass $m$ replaces charge $e$ and gravitational field $\mathbf{g}$ replaces the electric field $\mathbf{E}$. Thus this chapter is designed to be a review of the concepts that can be used for study of any conservative inverse-square law central fields.

In 1666 Newton formulated the Theory of Gravitation which he eventually published in the Principia in 1687. Newton’s Law of Gravitation states that each mass particle attracts every other particle in the universe with a force that varies directly as the product of the masses and inversely as the square of the distance between them. That is, the force on a gravitational point mass $m_G$ produced by a mass $M_G$ is

$$\mathbf{F}_m = -G\frac{m_G M_G}{r^2}\hat{\mathbf{r}} \qquad (2.143)$$

where $\hat{\mathbf{r}}$ is the unit vector pointing from the gravitational mass $M_G$ to the gravitational mass $m_G$ as shown in figure 2.7. Note that the force is attractive, that is, it points toward the other mass. This is in contrast to the repulsive electrostatic force between two similar charges. Newton’s law was verified by Cavendish using a torsion balance. The experimental value of $G = (6.6726 \pm 0.0008)\times 10^{-11}$ N·m$^2$/kg$^2$.

Figure 2.7: Gravitational force on mass $m$ due to an infinitesimal volume element of the mass density distribution.

The gravitational force between point particles can be extended to finite-sized bodies using the fact that the gravitational force field satisfies the superposition principle, that is, the net force is the vector sum of the individual forces between the component point particles. Thus the force summed over the mass distribution is

$$\mathbf{F}_m = -Gm_G\sum_{i=1}^{n}\frac{m_{Gi}}{r_i^2}\hat{\mathbf{r}}_i \qquad (2.144)$$

where $\mathbf{r}_i$ is the vector from the gravitational mass $m_{Gi}$ to the gravitational mass $m_G$ at the position $\mathbf{r}$.

For a continuous gravitational mass distribution $\rho_G(\mathbf{r}')$, the net force on the gravitational mass $m_G$ at the location $\mathbf{r}$ can be written as

$$\mathbf{F}_m(\mathbf{r}) = -Gm_G\int_V\frac{\rho_G(\mathbf{r}')\,\widehat{(\mathbf{r} - \mathbf{r}')}}{(\mathbf{r} - \mathbf{r}')^2}\,dv' \qquad (2.145)$$

where $dv'$ is the volume element at the point $\mathbf{r}'$ as illustrated in figure 2.7.
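
The superposition sum of equation 2.144 translates directly into a loop over source masses. The Python sketch below is a minimal illustration; the function name and the sample mass configurations are assumptions, and the value of $G$ is the one quoted above.

```python
import math

G_NEWTON = 6.6726e-11  # N m^2 / kg^2, value quoted in the text


def gravitational_force(m, r, sources):
    """Net force (Fx, Fy, Fz) on mass m at position r from (mass, position) sources."""
    fx = fy = fz = 0.0
    for M, (sx, sy, sz) in sources:
        # vector from the source mass to the mass m
        dx, dy, dz = r[0] - sx, r[1] - sy, r[2] - sz
        d = math.sqrt(dx * dx + dy * dy + dz * dz)
        f = -G_NEWTON * m * M / d ** 3     # attractive: points back toward the source
        fx += f * dx
        fy += f * dy
        fz += f * dz
    return fx, fy, fz
```

For two equal sources placed symmetrically at `(0.0, 1.0, 0.0)` and `(0.0, -1.0, 0.0)`, the force on a mass at `(1.0, 0.0, 0.0)` has no $y$ component and points in the $-x$ direction, toward the pair.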


2.14.1  Gravitational  and  inertial  mass 

Newton’s Laws use the concept of inertial mass $m_I \equiv m$ in relating the force $\mathbf{F}$ to acceleration $\mathbf{a}$

$$\mathbf{F} = m_I\mathbf{a} \qquad (2.146)$$

and momentum $\mathbf{p}$ to velocity $\mathbf{v}$

$$\mathbf{p} = m_I\mathbf{v} \qquad (2.147)$$

That is, inertial mass is the constant of proportionality relating the acceleration to the applied force.

The concept of gravitational mass $m_G$ is the constant of proportionality between the gravitational force and the amount of matter. That is, on the surface of the earth, the gravitational force is assumed to be

$$\mathbf{F}_G = m_G\mathbf{g} \qquad (2.148)$$

where $\mathbf{g}$ is the gravitational field, which is a position-dependent force per unit gravitational mass pointing towards the center of the Earth. The gravitational mass is measured when an object is weighed.

Newton’s  Law  of  Gravitation  leads  to  the  relation  for  the  gravitational  field  g (r)  at  the  location  r due 
to  a gravitational  mass  distribution  at  the  location  r'  as  given  by  the  integral  over  the  gravitational  mass 
density  pG 

f PG(r') 

g (r)  = -G/  V 2 ’ dv'  (2.149) 

Jv  (r  — r ) 

The acceleration of matter in a gravitational field relates the gravitational and inertial masses

$$\mathbf{F}_G = m_G \mathbf{g} = m_I \mathbf{a} \tag{2.150}$$

Thus

$$\mathbf{a} = \frac{m_G}{m_I}\,\mathbf{g} \tag{2.151}$$

That is, the acceleration of a body depends on the gravitational field strength $\mathbf{g}$ and the ratio of the gravitational and inertial masses. It has been shown experimentally that all matter is subject to the same acceleration in vacuum at a given location in a gravitational field. That is, the ratio $m_G/m_I$ is a constant common to all materials. Galileo first showed this when he dropped objects from the Tower of Pisa. Modern experiments have shown that this is true to 5 parts in $10^{13}$.

The exact equivalence of gravitational mass and inertial mass is called the weak principle of equivalence, which underlies the General Theory of Relativity as discussed in chapter 14. It is convenient to use the same unit for the gravitational and inertial masses, and thus both can be written in terms of the common mass symbol $m$.

$$m_I = m_G = m \tag{2.152}$$

Therefore the subscripts $G$ and $I$ can be omitted in equations 2.150 and 2.151. Also the local acceleration due to gravity $\mathbf{a}$ can be written as

$$\mathbf{a} = \mathbf{g} \tag{2.153}$$

The gravitational field $\mathbf{g} \equiv \mathbf{F}/m$ has units of N/kg in the MKS system while the acceleration $\mathbf{a}$ has units of m/s$^2$; the two units are equivalent.


2.14.2  Gravitational  potential  energy  U 


It was shown that for a conservative field it is possible to use the concept of a potential energy $U(\mathbf{r})$ which depends on position. The potential energy difference $\Delta U_{a\to b}$ between two points $\mathbf{r}_a$ and $\mathbf{r}_b$ is the work done moving from $a$ to $b$ against a force $\mathbf{F}$. That is:

$$\Delta U_{a\to b} = U(\mathbf{r}_b) - U(\mathbf{r}_a) = -\int_{\mathbf{r}_a}^{\mathbf{r}_b} \mathbf{F}\cdot d\mathbf{l} \tag{2.154}$$

In general, this line integral depends on the path taken.

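Consider the gravitational field produced by a single point mass $m_1$. The work done moving a mass $m_0$ from $\mathbf{r}_a$ to $\mathbf{r}_b$ in this gravitational field can be calculated along an arbitrary path, shown in figure 2.8, by assuming Newton’s law of gravitation. Then the force on $m_0$ due to the point mass $m_1$ is

$$\mathbf{F} = -G\frac{m_1 m_0}{r^2}\,\hat{\mathbf{r}} \tag{2.155}$$

Figure 2.8: Work done against a force field moving from a to b.

Expressing $d\mathbf{l}$ in spherical coordinates, $d\mathbf{l} = dr\,\hat{\mathbf{r}} + r\,d\theta\,\hat{\boldsymbol{\theta}} + r\sin\theta\,d\phi\,\hat{\boldsymbol{\phi}}$, the path integral (2.154) from $(r_a, \theta_a, \phi_a)$ to $(r_b, \theta_b, \phi_b)$ is

$$\Delta U_{a\to b} = -\int_a^b \mathbf{F}\cdot d\mathbf{l} = G m_1 m_0 \int_a^b \frac{1}{r^2}\left(\hat{\mathbf{r}}\cdot\hat{\mathbf{r}}\,dr + r\,\hat{\mathbf{r}}\cdot\hat{\boldsymbol{\theta}}\,d\theta + r\sin\theta\,\hat{\mathbf{r}}\cdot\hat{\boldsymbol{\phi}}\,d\phi\right) = G m_1 m_0 \int_{r_a}^{r_b} \frac{\hat{\mathbf{r}}\cdot\hat{\mathbf{r}}}{r^2}\,dr = -G m_1 m_0 \left[\frac{1}{r_b} - \frac{1}{r_a}\right] \tag{2.156}$$

since the scalar product of the unit vectors $\hat{\mathbf{r}}\cdot\hat{\mathbf{r}} = 1$. Note that the second and third terms vanish since $\hat{\mathbf{r}}\cdot\hat{\boldsymbol{\theta}} = \hat{\mathbf{r}}\cdot\hat{\boldsymbol{\phi}} = 0$ because the unit vectors are mutually orthogonal. Thus the line integral depends only on the starting and ending radii and is independent of the angular coordinates or the detailed path taken between $(r_a, \theta_a, \phi_a)$ and $(r_b, \theta_b, \phi_b)$.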
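The path independence of equation 2.156 can be checked numerically. The following Python sketch is illustrative only: the masses, the end points, and the two paths are arbitrary assumptions, and the helper names `force` and `delta_U` are hypothetical. It evaluates the line integral of equation 2.154 along a straight path and along a detour with the same end points, and compares both with the closed-form result.

```python
import math

G = 6.674e-11  # gravitational constant, N m^2 / kg^2

def force(pos, m1, m0):
    """Force on probe mass m0 at pos due to point mass m1 at the origin (eq. 2.155)."""
    x, y, z = pos
    r = math.sqrt(x * x + y * y + z * z)
    k = -G * m1 * m0 / r**3  # -G m1 m0 / r^2 times the unit vector r/|r|
    return (k * x, k * y, k * z)

def delta_U(path, m1, m0, steps=20000):
    """Midpoint-rule estimate of -integral(F . dl) along path(t), t in [0, 1]."""
    total = 0.0
    prev = path(0.0)
    for i in range(1, steps + 1):
        cur = path(i / steps)
        mid = tuple(0.5 * (p + c) for p, c in zip(prev, cur))
        f = force(mid, m1, m0)
        total += sum(fi * (c - p) for fi, c, p in zip(f, cur, prev))
        prev = cur
    return -total

m1, m0 = 5.0e10, 2.0   # illustrative source and probe masses, kg
a = (1.0, 0.0, 0.0)    # start point, r_a = 1 m
b = (0.0, 3.0, 0.0)    # end point,   r_b = 3 m

def straight(t):
    return tuple(ai + t * (bi - ai) for ai, bi in zip(a, b))

def detour(t):
    # Same end points, but the path bulges out of the x-y plane.
    x, y, z = straight(t)
    return (x, y, z + 2.0 * math.sin(math.pi * t))

analytic = -G * m1 * m0 * (1.0 / 3.0 - 1.0 / 1.0)  # eq. 2.156 with r_a = 1, r_b = 3
print(delta_U(straight, m1, m0), delta_U(detour, m1, m0), analytic)
```

The three printed values should agree closely, since only the starting and ending radii enter the result.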

Consider the Principle of Superposition for a gravitational field produced by a set of $n$ point masses. The line integral then can be written as:

$$\Delta U^{net}_{a\to b} = -\int_{\mathbf{r}_a}^{\mathbf{r}_b} \mathbf{F}_{net}\cdot d\mathbf{l} = -\sum_{i=1}^{n}\int_{\mathbf{r}_a}^{\mathbf{r}_b} \mathbf{F}_i\cdot d\mathbf{l} = \sum_{i=1}^{n}\Delta U^{i}_{a\to b} \tag{2.157}$$

Thus the net potential energy difference is the sum of the contributions from each point mass producing the gravitational force field. Since each component force is conservative, the net force also must be conservative. For a conservative force, this line integral is independent of the path taken; it depends only on the starting and ending positions, $\mathbf{r}_a$ and $\mathbf{r}_b$. That is, the potential energy is a local function dependent only on position. The usefulness of gravitational potential energy is that, since the gravitational force is a conservative force, it is possible to solve many problems in classical mechanics using the fact that the sum of the kinetic energy and potential energy is a constant. Note that the gravitational field is conservative, since the potential energy difference $\Delta U^{net}_{a\to b}$ is independent of the path taken. It is conservative because the force is radial and time independent; it is not due to the $1/r^2$ dependence.

2.14.3 Gravitational potential $\phi$

Using $\mathbf{F} = m_0\mathbf{g}$ gives that the change in potential energy due to moving a mass $m_0$ from $a$ to $b$ in a gravitational field $\mathbf{g}$ is:

$$\Delta U^{net}_{a\to b} = -m_0\int_{\mathbf{r}_a}^{\mathbf{r}_b} \mathbf{g}_{net}\cdot d\mathbf{l} \tag{2.158}$$

Note that the probe mass $m_0$ factors out from the integral. It is convenient to define a new quantity called the gravitational potential $\phi$ where

$$\Delta\phi^{net}_{a\to b} = \frac{\Delta U^{net}_{a\to b}}{m_0} = -\int_{\mathbf{r}_a}^{\mathbf{r}_b} \mathbf{g}_{net}\cdot d\mathbf{l} \tag{2.159}$$

That is, the gravitational potential difference is the work that must be done, per unit mass, to move from $a$ to $b$ with no change in kinetic energy. Be careful not to confuse the gravitational potential energy difference $\Delta U_{a\to b}$ and the gravitational potential difference $\Delta\phi_{a\to b}$; that is, $\Delta U$ has units of energy, Joules, while $\Delta\phi$ has units of Joules/kg.

The gravitational potential is a property of the gravitational force field; it is given as minus the line integral of the gravitational field from $a$ to $b$. The change in gravitational potential energy for moving a mass $m_0$ from $a$ to $b$ is given in terms of gravitational potential by:

$$\Delta U^{net}_{a\to b} = m_0\,\Delta\phi^{net}_{a\to b} \tag{2.160}$$


Superposition  and  potential 

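Previously it was shown that the gravitational force is conservative for the superposition of many masses. To recap, if the gravitational field

$$\mathbf{g}_{net} = \mathbf{g}_1 + \mathbf{g}_2 + \mathbf{g}_3 \tag{2.161}$$

then

$$\Delta\phi^{net}_{a\to b} = -\int_{\mathbf{r}_a}^{\mathbf{r}_b}\mathbf{g}_{net}\cdot d\mathbf{l} = -\int_{\mathbf{r}_a}^{\mathbf{r}_b}\mathbf{g}_1\cdot d\mathbf{l} - \int_{\mathbf{r}_a}^{\mathbf{r}_b}\mathbf{g}_2\cdot d\mathbf{l} - \int_{\mathbf{r}_a}^{\mathbf{r}_b}\mathbf{g}_3\cdot d\mathbf{l} = \sum_{i=1}^{3}\Delta\phi^{i}_{a\to b} \tag{2.162}$$

Thus gravitational potential is a simple additive scalar field because the Principle of Superposition applies. The gravitational potential difference between two points differing by $h$ in height, in a uniform field $g$, is $gh$. Clearly, the greater $g$ or $h$, the greater the energy released by the gravitational field when dropping a body through the height $h$. The unit of gravitational potential is the Joule/kg.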
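The additivity of the potential can be illustrated with a short Python sketch. It assumes point masses whose individual potential, referenced to infinity, is $-Gm/r$ (equation 2.156 per unit probe mass with $r_a \to \infty$); the source masses, positions, and the function name `potential` are hypothetical choices made only for illustration. The last lines compare the exact potential difference over a height $h$ near the Earth's surface with the uniform-field estimate $gh$, using commonly quoted values for the Earth's mass and radius.

```python
import math

G = 6.674e-11  # gravitational constant, N m^2 / kg^2

def potential(point, sources):
    """Scalar potential at `point` from point masses, referenced to infinity."""
    return sum(-G * m / math.dist(point, pos) for m, pos in sources)

# Three illustrative point masses: (mass in kg, position in m).
sources = [
    (4.0e10, (0.0, 0.0, 0.0)),
    (1.0e10, (5.0, 0.0, 0.0)),
    (2.0e10, (0.0, 5.0, 0.0)),
]
a = (1.0, 1.0, 0.0)
b = (3.0, 3.0, 1.0)

# Net potential difference versus the sum of the individual contributions (eq. 2.162).
dphi_net = potential(b, sources) - potential(a, sources)
dphi_parts = sum(potential(b, [s]) - potential(a, [s]) for s in sources)

# Potential energy change for a probe mass m0 (eq. 2.160).
m0 = 2.0
dU = m0 * dphi_net

# Near the Earth's surface the potential difference over a height h is close to g*h.
M_earth, R_earth, h = 5.972e24, 6.371e6, 100.0
g_surface = G * M_earth / R_earth**2
dphi_height = -G * M_earth / (R_earth + h) + G * M_earth / R_earth

print(dphi_net, dphi_parts, dU)
print(dphi_height, g_surface * h)
```

Because the potential is a scalar, the superposition is a plain sum of numbers; no vector components need to be tracked.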
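2.14.4 Potential theory

The gravitational force and electrostatic force both obey the inverse square law, for which the field and corresponding potential are related by:

$$\Delta\phi_{a\to b} = -\int_{\mathbf{r}_a}^{\mathbf{r}_b}\mathbf{g}\cdot d\mathbf{l} \tag{2.163}$$

For an arbitrary infinitesimal element of distance $d\mathbf{l}$ the change in potential $d\phi$ is

$$d\phi = -\mathbf{g}\cdot d\mathbf{l} \tag{2.164}$$

Using cartesian coordinates both $\mathbf{g}$ and $d\mathbf{l}$ can be written as

$$\mathbf{g} = \hat{\mathbf{i}}\,g_x + \hat{\mathbf{j}}\,g_y + \hat{\mathbf{k}}\,g_z \qquad d\mathbf{l} = \hat{\mathbf{i}}\,dx + \hat{\mathbf{j}}\,dy + \hat{\mathbf{k}}\,dz \tag{2.165}$$

Taking the scalar product gives:

$$d\phi = -\mathbf{g}\cdot d\mathbf{l} = -g_x\,dx - g_y\,dy - g_z\,dz \tag{2.166}$$

Differential calculus expresses the change in potential $d\phi$ in terms of partial derivatives by:

$$d\phi = \frac{\partial\phi}{\partial x}dx + \frac{\partial\phi}{\partial y}dy + \frac{\partial\phi}{\partial z}dz \tag{2.167}$$

By association, 2.166 and 2.167 imply that

$$g_x = -\frac{\partial\phi}{\partial x} \qquad g_y = -\frac{\partial\phi}{\partial y} \qquad g_z = -\frac{\partial\phi}{\partial z} \tag{2.168}$$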

Thus on each axis, the gravitational field can be written as minus the gradient of the gravitational potential. In three dimensions, the gravitational field is minus the total gradient of the potential, and the gradient of the scalar function $\phi$ can be written as:

$$\mathbf{g} = -\nabla\phi \tag{2.169}$$

In cartesian coordinates this equals

$$\mathbf{g} = -\left[\hat{\mathbf{i}}\frac{\partial\phi}{\partial x} + \hat{\mathbf{j}}\frac{\partial\phi}{\partial y} + \hat{\mathbf{k}}\frac{\partial\phi}{\partial z}\right] \tag{2.170}$$


Thus the gravitational field is just minus the gradient of the gravitational potential, which always is perpendicular to the equipotentials. Skiers are familiar with the concept of gravitational equipotentials and the fact that the line of steepest descent, and thus maximum acceleration, is perpendicular to gravitational equipotentials of constant height. The advantage of using potential theory for inverse-square law forces is that scalar potentials replace the more complicated vector forces, which greatly simplifies calculation. Potential theory plays a crucial role for handling both gravitational and electrostatic forces.
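The relation $\mathbf{g} = -\nabla\phi$ can be verified numerically for a point mass. The sketch below is a hedged illustration, not part of the derivation: it assumes the point-mass potential $\phi = -GM/r$ referenced to infinity, an arbitrary mass and field point, and hypothetical function names. It differentiates the potential by central differences and compares the result with the field given directly by Newton's law.

```python
import math

G = 6.674e-11   # gravitational constant, N m^2 / kg^2
M = 1.0e10      # illustrative point mass at the origin, kg

def phi(x, y, z):
    """Potential of a point mass at the origin, referenced to infinity."""
    return -G * M / math.sqrt(x * x + y * y + z * z)

def g_from_potential(x, y, z, h=1.0e-4):
    """Approximate g = -grad(phi) with central differences (eq. 2.170)."""
    gx = -(phi(x + h, y, z) - phi(x - h, y, z)) / (2 * h)
    gy = -(phi(x, y + h, z) - phi(x, y - h, z)) / (2 * h)
    gz = -(phi(x, y, z + h) - phi(x, y, z - h)) / (2 * h)
    return (gx, gy, gz)

def g_newton(x, y, z):
    """Field from Newton's law: magnitude G M / r^2, directed towards the mass."""
    r = math.sqrt(x * x + y * y + z * z)
    k = -G * M / r**3
    return (k * x, k * y, k * z)

point = (1.0, 2.0, 2.0)   # field point with r = 3 m
print(g_from_potential(*point))
print(g_newton(*point))
```

The two printed vectors should agree to within the finite-difference error, which illustrates why the scalar potential carries the same information as the vector field.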


2.14.5  Curl  of  the  gravitational  field 


It has been shown that the gravitational field is conservative, that is, $\Delta U_{a\to b}$ is independent of the path taken between $a$ and $b$. Therefore, equation 2.159 gives that the gravitational potential difference is independent of the path taken between two points $a$ and $b$. Consider two possible paths between $a$ and $b$ as shown in figure 2.9. The line integral from $a$ to $b$ via route 1 is equal and opposite to the line integral back from $b$ to $a$ via route 2 if the gravitational field is conservative as shown earlier.

Figure 2.9: Circulation of the gravitational field.

A better way of expressing this is that the line integral of the gravitational field is zero around any closed path. Thus the line integral between $a$ and $b$, via path 1, and the line integral returning back to $a$, via path 2, are equal and opposite. That is, the net line integral for a closed loop is zero

$$\oint \mathbf{g}_{net}\cdot d\mathbf{l} = 0 \tag{2.171}$$

which  is  a measure  of  the  circulation  of  the  gravitational  field.  The  fact  that  the  circulation  equals  zero 
corresponds  to  the  statement  that  the  gravitational  field  is  radial  for  a point  mass. 

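Stokes’ Theorem, discussed in appendix H3, states that

$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \int_{\text{Area bounded by } C} (\nabla\times\mathbf{F})\cdot d\mathbf{S} \tag{2.172}$$

Thus the zero circulation of the gravitational field can be rewritten as

$$\oint_C \mathbf{g}\cdot d\mathbf{l} = \int_{\text{Area bounded by } C} (\nabla\times\mathbf{g})\cdot d\mathbf{S} = 0 \tag{2.173}$$

Since this is independent of the shape of the perimeter $C$, therefore

$$\nabla\times\mathbf{g} = 0 \tag{2.174}$$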


That  is,  the  gravitational  field  is  a curl-free  field. 

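A property of any curl-free field is that it can be expressed as the gradient of a scalar potential $\phi$ since

$$\nabla\times\nabla\phi = 0 \tag{2.175}$$

Therefore, the curl-free gravitational field can be related to a scalar potential $\phi$ as

$$\mathbf{g} = -\nabla\phi \tag{2.176}$$

Thus $\phi$ is consistent with the above definition of gravitational potential $\phi$ in that the scalar product

$$\Delta\phi_{a\to b} = -\int_a^b \mathbf{g}_{net}\cdot d\mathbf{l} = \int_a^b (\nabla\phi)\cdot d\mathbf{l} = \int_a^b \sum_i \frac{\partial\phi}{\partial x_i}\,dx_i = \int_a^b d\phi \tag{2.177}$$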

An  identical  relation  between  the  electric  field  and  electric  potential  applies  for  the  inverse-square  law 
electrostatic  field. 
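The curl-free property of equation 2.174 can be checked with finite differences. The following Python sketch is an illustration under stated assumptions: the two source masses, the evaluation point, and the helper names are hypothetical, and a simple rotational field $(-y, x, 0)$ is included purely as a contrasting example of a field with nonzero curl.

```python
import math

G = 6.674e-11  # gravitational constant, N m^2 / kg^2
SOURCES = [(1.0e10, (0.0, 0.0, 0.0)), (3.0e10, (4.0, 1.0, -2.0))]  # (kg, m)

def g_field(p):
    """Net field of the point masses in SOURCES by superposition."""
    gx = gy = gz = 0.0
    for m, (sx, sy, sz) in SOURCES:
        dx, dy, dz = p[0] - sx, p[1] - sy, p[2] - sz
        k = -G * m / (dx * dx + dy * dy + dz * dz) ** 1.5
        gx, gy, gz = gx + k * dx, gy + k * dy, gz + k * dz
    return (gx, gy, gz)

def rotational_field(p):
    """Non-conservative comparison field F = (-y, x, 0), whose curl is (0, 0, 2)."""
    return (-p[1], p[0], 0.0)

def curl(field, p, h=1.0e-4):
    """Central-difference estimate of curl(field) at the point p."""
    def d(comp, axis):
        # Partial derivative of component `comp` with respect to coordinate `axis`.
        plus = list(p)
        minus = list(p)
        plus[axis] += h
        minus[axis] -= h
        return (field(plus)[comp] - field(minus)[comp]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

point = (1.5, -0.5, 2.0)
print(curl(g_field, point))           # each component should be close to zero
print(curl(rotational_field, point))  # close to (0, 0, 2)
```

The gravitational curl vanishes to within the discretization error even though the net field of two masses is no longer radial about a single point, consistent with superposition of conservative fields.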


Reference  potentials: 

Note that only differences in potential energy, $U$, and gravitational potential, $\phi$, are meaningful; the absolute values depend on some arbitrarily chosen reference. However, often it is useful to measure gravitational potential with respect to a particular arbitrarily chosen reference point $\phi_0$ such as sea level. Aircraft pilots are required to set their altimeters to read with respect to sea level rather than their departure airport. This ensures that aircraft leaving from, say, both Rochester, 559 msl, and Denver, 5000 msl, have their altimeters set to a common reference to ensure that they do not collide. The gravitational field is minus the gradient of the gravitational potential, which depends only on differences in potential, and thus is independent of any constant reference.


Gravitational potential due to continuous distributions of mass

Suppose mass is distributed over a volume $v$ with a density $\rho$ at any point within the volume. The gravitational potential at any field point $p$ due to an element of mass $dm = \rho\,dv$ at the point $p'$ is given by:

$$\Delta\phi_{\infty\to p} = -G\int_v \frac{\rho(p')\,dv'}{r_{p'p}} \tag{2.178}$$

This integral is over a scalar quantity. Since gravitational potential $\phi$ is a scalar quantity, it is easier to compute than is the vector gravitational field $\mathbf{g}$. If the scalar potential field is known, then the gravitational field is derived by taking minus the gradient of the gravitational potential.
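As a sketch of how the scalar integral 2.178 can be evaluated in practice, the Python example below integrates over a uniform sphere with a simple midpoint rule in spherical coordinates and compares the result with $-GM/r$, the well-known exterior potential of a spherically symmetric mass. The mass, radius, grid size, and the function name `sphere_potential` are assumptions chosen only for illustration.

```python
import math

G = 6.674e-11  # gravitational constant, N m^2 / kg^2

def sphere_potential(r, M, R, n=200):
    """Midpoint-rule evaluation of eq. 2.178 for a uniform sphere of mass M and
    radius R, at a field point on the polar axis a distance r > R from the centre."""
    rho = 3.0 * M / (4.0 * math.pi * R**3)
    d_rp = R / n
    d_th = math.pi / n
    total = 0.0
    for i in range(n):
        rp = (i + 0.5) * d_rp
        for j in range(n):
            th = (j + 0.5) * d_th
            # Distance from the ring of matter at (rp, th) to the field point.
            dist = math.sqrt(r * r + rp * rp - 2.0 * r * rp * math.cos(th))
            # Ring volume element, already integrated over azimuth (factor 2*pi).
            dv = 2.0 * math.pi * rp * rp * math.sin(th) * d_rp * d_th
            total += rho * dv / dist
    return -G * total

M, R = 5.0e10, 1.0   # illustrative mass (kg) and radius (m)
r = 2.5              # field point outside the sphere, m
print(sphere_potential(r, M, R), -G * M / r)
```

Only a scalar is accumulated inside the loop, whereas a direct evaluation of the field would require summing vector components.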
2.14.6 Gauss’s Law for Gravitation

The flux $\Phi$ of the gravitational field $\mathbf{g}$ through a surface $S$, as shown in figure 2.10, is defined as

$$\Phi \equiv \int_S \mathbf{g}\cdot d\mathbf{S} \tag{2.179}$$


Note that there are two possible perpendicular directions that could be chosen for the surface vector $d\mathbf{S}$. Using Newton’s law of gravitation for a point mass $m$ the flux through the surface $S$ is

$$\Phi = -Gm\int_S \frac{\hat{\mathbf{r}}\cdot d\mathbf{S}}{r^2} \tag{2.180}$$

Figure 2.10: Flux of the gravitational field through an infinitesimal surface element dS.

Note that the solid angle subtended by the surface $d\mathbf{S}$ at an angle $\theta$ to the normal from the point mass is given by

$$d\Omega = \frac{\cos\theta\,dS}{r^2} = \frac{\hat{\mathbf{r}}\cdot d\mathbf{S}}{r^2} \tag{2.181}$$

Thus the net gravitational flux equals

$$\Phi = -Gm\int_S d\Omega \tag{2.182}$$


Consider a closed surface where the direction of the surface vector $d\mathbf{S}$ is defined as outwards. The net flux out of this closed surface is given by

$$\Phi = -Gm\oint_S \frac{\hat{\mathbf{r}}\cdot d\mathbf{S}}{r^2} = -Gm\oint_S d\Omega = -4\pi Gm \tag{2.183}$$

This is independent of where the point mass lies within the closed surface and of the shape of the closed surface. Note that the net solid angle subtended is zero if the point mass lies outside the closed surface. Thus the flux is as given by equation 2.183 if the mass is enclosed by the closed surface, while it is zero if the mass is outside of the closed surface.
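Equation 2.183 can be tested numerically by integrating $\mathbf{g}\cdot d\mathbf{S}$ over a sphere for a point mass placed off-centre inside it and for one placed outside it. The Python sketch below is illustrative: the mass, its two positions, the grid size, and the function name `flux_through_sphere` are assumptions, and a plain midpoint rule is used for the surface integral.

```python
import math

G = 6.674e-11  # gravitational constant, N m^2 / kg^2

def flux_through_sphere(mass_pos, m, radius=1.0, n=300):
    """Midpoint-rule estimate of the outward flux of g (eq. 2.179) through a sphere
    of the given radius centred on the origin, for a point mass m at mass_pos."""
    d_th = math.pi / n
    d_ph = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * d_th
        sin_th, cos_th = math.sin(th), math.cos(th)
        for j in range(n):
            ph = (j + 0.5) * d_ph
            # Outward unit normal at this surface point.
            nx, ny, nz = sin_th * math.cos(ph), sin_th * math.sin(ph), cos_th
            # Vector from the point mass to the surface point.
            dx = radius * nx - mass_pos[0]
            dy = radius * ny - mass_pos[1]
            dz = radius * nz - mass_pos[2]
            k = -G * m / (dx * dx + dy * dy + dz * dz) ** 1.5
            g_dot_n = k * (dx * nx + dy * ny + dz * nz)
            total += g_dot_n * radius**2 * sin_th * d_th * d_ph
    return total

m = 1.0e10  # illustrative point mass, kg
inside = flux_through_sphere((0.3, 0.0, 0.4), m)    # mass enclosed, off-centre
outside = flux_through_sphere((0.0, 2.0, 1.0), m)   # mass outside the surface
print(inside, -4.0 * math.pi * G * m)
print(outside)
```

The enclosed case should reproduce $-4\pi Gm$ regardless of where the mass sits inside the sphere, while the external case should give a flux close to zero.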

Since the flux for a point mass is independent of the location of the mass within the volume enclosed by the closed surface, and using the principle of superposition for the gravitational field, then for $n$ enclosed point masses the net flux is

$$\Phi = \oint_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\sum_{i}^{n} m_i \tag{2.184}$$

This can be extended to continuous mass distributions, with local mass density $\rho$, giving that the net flux

$$\Phi = \oint_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_{\text{enclosed volume}} \rho\,dv \tag{2.185}$$

Gauss’s Divergence Theorem was given in appendix H2 as

$$\Phi = \oint_S \mathbf{F}\cdot d\mathbf{S} = \int_{\text{enclosed volume}} \nabla\cdot\mathbf{F}\,dv \tag{2.186}$$

Applying the Divergence Theorem to Gauss’s law gives that

$$\Phi = \oint_S \mathbf{g}\cdot d\mathbf{S} = \int_{\text{enclosed volume}} \nabla\cdot\mathbf{g}\,dv = -4\pi G\int_{\text{enclosed volume}} \rho\,dv$$

or

$$\int_{\text{enclosed volume}} \left[\nabla\cdot\mathbf{g} + 4\pi G\rho\right] dv = 0 \tag{2.187}$$
This is true independent of the shape of the surface, thus the divergence of the gravitational field

$$\nabla\cdot\mathbf{g} = -4\pi G\rho \tag{2.188}$$

This is a statement that the gravitational field of a point mass has a $1/r^2$ dependence.

Using the fact that the gravitational field is conservative, the field can be expressed as minus the gradient of the gravitational potential $\phi$,

$$\mathbf{g} = -\nabla\phi \tag{2.189}$$

and Gauss’s law then becomes

$$\nabla\cdot\nabla\phi = 4\pi G\rho \tag{2.190}$$

which also can be written as Poisson’s equation

$$\nabla^2\phi = 4\pi G\rho \tag{2.191}$$

Knowing the mass distribution $\rho$ allows determination of the potential by solving Poisson’s equation. A special case that often is encountered is when the mass distribution is zero in a given region. Then the potential for this region can be determined by solving Laplace’s equation with known boundary conditions.

$$\nabla^2\phi = 0 \tag{2.192}$$

For example, Laplace’s equation applies in the free space between the masses. It is used extensively in electrostatics to compute the electric potential between charged conductors which themselves are equipotentials.
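Poisson’s and Laplace’s equations can be illustrated with a finite-difference Laplacian applied to the potential of a uniform sphere. The sketch below makes two assumptions that are not derived in the text: the interior potential $\phi = -GM(3R^2 - r^2)/(2R^3)$, which is the standard result referenced to infinity, and arbitrary illustrative values for $M$, $R$, and the two test points. The helper name `laplacian` is hypothetical.

```python
import math

G = 6.674e-11                             # gravitational constant, N m^2 / kg^2
M, R = 1.0e10, 1.0                        # illustrative uniform sphere: kg, m
RHO = 3.0 * M / (4.0 * math.pi * R**3)    # its mass density

def phi(x, y, z):
    """Potential of the uniform sphere, referenced to infinity (assumed standard result)."""
    r = math.sqrt(x * x + y * y + z * z)
    if r >= R:
        return -G * M / r
    return -G * M * (3.0 * R * R - r * r) / (2.0 * R**3)

def laplacian(f, x, y, z, h=1.0e-3):
    """Second-order central-difference estimate of the Laplacian of f."""
    total = (f(x + h, y, z) + f(x - h, y, z)
             + f(x, y + h, z) + f(x, y - h, z)
             + f(x, y, z + h) + f(x, y, z - h)
             - 6.0 * f(x, y, z))
    return total / (h * h)

inside = laplacian(phi, 0.3, 0.2, 0.1)    # Poisson region: expect 4 pi G rho
outside = laplacian(phi, 1.5, 1.0, 0.5)   # Laplace region: expect 0
print(inside, 4.0 * math.pi * G * RHO)
print(outside)
```

Inside the mass distribution the Laplacian equals $4\pi G\rho$ (equation 2.191), while in the empty region outside it vanishes (equation 2.192).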


2.14.7  Condensed  forms  of  Newton’s  Law  of  Gravitation 


The above discussion has resulted in several alternative expressions of Newton’s Law of Gravitation that will be summarized here. The most direct statement of Newton’s law is

$$\mathbf{g}(\mathbf{r}) = -G\int_V \frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r}-\mathbf{r}')}}{(\mathbf{r}-\mathbf{r}')^2}\,dv' \tag{2.193}$$

An elegant way to express Newton’s Law of Gravitation is in terms of the flux and circulation of the gravitational field. That is,

Flux:

$$\Phi \equiv \oint_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_{\text{enclosed volume}} \rho\,dv \tag{2.194}$$

Circulation:

$$\oint \mathbf{g}_{net}\cdot d\mathbf{l} = 0 \tag{2.195}$$


The flux and circulation are better expressed in terms of the vector differential concepts of divergence and curl.

Divergence:

$$\nabla\cdot\mathbf{g} = -4\pi G\rho \tag{2.196}$$

Curl:

$$\nabla\times\mathbf{g} = 0 \tag{2.197}$$

Remember that the flux and divergence of the gravitational field are statements that the field between point masses has a $1/r^2$ dependence. The circulation and curl are statements that the field between point masses is radial.

Because the gravitational field is conservative it is possible to use the concept of the scalar potential field $\phi$. This concept is especially useful for solving some problems since the gravitational potential can be evaluated using the scalar integral

$$\Delta\phi_{\infty\to p} = -G\int_v \frac{\rho(p')\,dv'}{r_{p'p}} \tag{2.198}$$

An alternate approach is to solve Poisson’s equation if the boundary values and mass distributions are known, where Poisson’s equation is:

$$\nabla^2\phi = 4\pi G\rho \tag{2.199}$$

These alternate expressions of Newton’s law of gravitation can be exploited to solve problems. The method of solution is identical to that used in electrostatics.


2.16  Example:  Field  of  a uniform  sphere 

Consider the simple case of the gravitational field due to a uniform sphere of matter of radius $R$ and mass $M$. Then the volume mass density

$$\rho = \frac{3M}{4\pi R^3}$$

The gravitational field and potential for this uniform sphere of matter can be derived three ways:

a) The field can be evaluated by directly integrating over the volume

$$\mathbf{g}(\mathbf{r}) = -G\int_V \frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r}-\mathbf{r}')}}{(\mathbf{r}-\mathbf{r}')^2}\,dv'$$

b) The potential can be evaluated directly by integration of

$$\Delta\phi_{\infty\to p} = -G\int_v \frac{\rho(p')\,dv'}{r_{p'p}}$$

and then

$$\mathbf{g} = -\nabla\phi$$

c) The obvious spherical symmetry can be used in conjunction with Gauss’s law to easily solve this problem.

$$\oint_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_{\text{enclosed volume}} \rho\,dv$$

That is, for $r > R$

$$4\pi r^2 g(r) = -4\pi GM \qquad (r > R)$$

$$\mathbf{g} = -G\frac{M}{r^2}\,\hat{\mathbf{r}} \qquad (r > R)$$

Similarly, for $r < R$

$$4\pi r^2 g(r) = -4\pi G\,\frac{4\pi}{3}r^3\rho \qquad (r < R)$$

That is:

$$\mathbf{g} = -G\frac{M}{R^3}\,\mathbf{r} \qquad (r < R)$$

Figure: Gravitational field g and gravitational potential Φ of a uniformly-dense spherical mass distribution of radius R.

The field inside the Earth is radial and is proportional to the distance from the center of the Earth. This is Hooke’s Law, and thus, ignoring air drag, any body dropped down a hole through the center of the Earth will undergo harmonic oscillations with an angular frequency of $\omega_0 = \sqrt{GM/R^3}$. This gives a period of oscillation of 1.4 hours, which is about the length of a classical mechanics lecture, which may seem like a long time.

Clearly  method  (c)  is  much  simpler  to  solve  for  this  case.  In  general,  look  for  a symmetry  that  allows 
identification  of  a surface  upon  which  the  magnitude  and  direction  of  the  field  is  constant.  For  such  cases 
use  Gauss’s  law.  Otherwise  use  methods  (a)  or  (b)  whichever  one  is  easiest  to  apply.  Further  examples  will 
not  be  given  here  since  they  are  essentially  identical  to  those  discussed  extensively  in  electrostatics. 
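The quoted period of about 1.4 hours can be reproduced with a few lines of Python. This is a rough sketch that, like the example itself, treats the Earth as a uniform sphere; the numerical values of $G$, the Earth's mass, and its mean radius are commonly quoted figures supplied here as assumptions, and the function name `g_radial` is hypothetical.

```python
import math

G = 6.674e-11          # gravitational constant, N m^2 / kg^2
M_EARTH = 5.972e24     # mass of the Earth, kg (assumed value)
R_EARTH = 6.371e6      # mean radius of the Earth, m (assumed value)

def g_radial(r, M=M_EARTH, R=R_EARTH):
    """Radial component of g for a uniform sphere; negative means towards the centre."""
    if r < R:
        return -G * M * r / R**3   # interior: linear in r (Hooke's law form)
    return -G * M / r**2           # exterior: inverse-square law

omega0 = math.sqrt(G * M_EARTH / R_EARTH**3)   # angular frequency, rad/s
period_hours = 2.0 * math.pi / omega0 / 3600.0

print(g_radial(R_EARTH))   # surface value, about -9.8 m/s^2
print(period_hours)        # about 1.4 hours
```

The interior and exterior expressions match at $r = R$, and the period depends only on the mean density of the sphere through $GM/R^3$.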
2.15 Summary


Newton’s  Laws  of  Motion: 

A cursory  review  of  Newtonian  mechanics  has  been  presented.  The  concept  of  inertial  frames  of  reference 
was  introduced  since  Newton’s  laws  of  motion  apply  only  to  inertial  frames  of  reference. 

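Newton’s Law of motion

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{2.6}$$

leads to second-order equations of motion which can be difficult to handle for many-body systems.

Solution of Newton’s second-order equations of motion can be simplified using the three first-order integrals coupled with corresponding conservation laws. The first-order time integral for linear momentum is

$$\int_1^2 \mathbf{F}_i\,dt = \int_1^2 \frac{d\mathbf{p}_i}{dt}\,dt = (\mathbf{p}_2 - \mathbf{p}_1)_i \tag{2.10}$$

The first-order time integral for angular momentum is

$$\frac{d\mathbf{L}_i}{dt} = \mathbf{r}_i\times\frac{d\mathbf{p}_i}{dt} = \mathbf{N}_i \qquad \int_1^2 \mathbf{N}_i\,dt = \int_1^2\frac{d\mathbf{L}_i}{dt}\,dt = (\mathbf{L}_2 - \mathbf{L}_1)_i \tag{2.16}$$

The first-order spatial integral is related to kinetic energy and the concept of work. That is

$$\mathbf{F}_i = \frac{dT_i}{d\mathbf{r}_i} \qquad \int_1^2 \mathbf{F}_i\cdot d\mathbf{r}_i = (T_2 - T_1)_i \tag{2.21}$$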

The conditions that lead to conservation of linear and angular momentum and total mechanical energy were discussed for many-body systems. The important class of conservative forces was shown to apply if the position-dependent forces do not depend on time or velocity, and if the work done by a force $\int_1^2 \mathbf{F}_i\cdot d\mathbf{r}_i$ is independent of the path taken between the initial and final locations. The total mechanical energy is a constant of motion when the forces are conservative.

It was shown that the concept of center of mass of a many-body or finite-sized body separates naturally for all three first-order integrals. The center of mass is that point about which

$$\int \mathbf{r}'\rho\,dV = 0 \qquad \text{(Centre of mass definition)}$$

where $\mathbf{r}'$ is the vector defining the location of mass $m_i$ with respect to the center of mass. The concept of center of mass greatly simplifies the description of the motion of finite-sized bodies and many-body systems by separating out the important internal interactions and corresponding underlying physics, from the trivial overall translational motion of a many-body system.

The Virial theorem states that the time-averaged properties are related by

$$\langle T\rangle = -\frac{1}{2}\left\langle\sum_i \mathbf{F}_i\cdot\mathbf{r}_i\right\rangle \tag{2.86}$$

It was shown that the Virial theorem is useful for relating the time-averaged kinetic and potential energies, especially for cases involving either linear or inverse-square forces.

Typical  examples  were  presented  of  application  of  Newton’s  equations  of  motion  to  solving  systems 
involving  constant,  linear,  position-dependent,  velocity-dependent,  and  time-dependent  forces,  to  constrained 
and  unconstrained  systems,  as  well  as  systems  with  variable  mass.  Rigid-body  rotation  about  a body-fixed 
rotation  axis  also  was  discussed. 

It  is  important  to  be  cognizant  of  the  following  limitations  that  apply  to  Newton’s  laws  of  motion: 

1) Newtonian mechanics assumes that all observables are measured to unlimited precision, that is, $t$, $E$, $\mathbf{p}$, $\mathbf{r}$ are known exactly. Quantum physics introduces limits to measurement due to wave-particle duality.

2) The Newtonian view is that time and position are absolute concepts. The Theory of Relativity shows that this is not true. Fortunately for most problems $v \ll c$ and thus Newtonian mechanics is an excellent approximation.

3) Another limitation, to be discussed later, is that it is impractical to solve the equations of motion for many interacting bodies such as molecules in a gas. Then it is necessary to resort to using statistical averages; this approach is called statistical mechanics.

Newton’s  work  constitutes  a theory  of  motion  in  the  universe  that  introduces  the  concept  of  causality. 
Causality means that there is a one-to-one correspondence between cause and effect. Each force causes a known
effect  that  can  be  calculated.  Thus  the  causal  universe  is  pictured  by  philosophers  to  be  a giant  machine 
whose  parts  move  like  clockwork  in  a predictable  and  predetermined  way  according  to  the  laws  of  nature.  This 
is  a deterministic  view  of  nature.  There  are  philosophical  problems  in  that  such  a deterministic  viewpoint 
appears  to  be  contrary  to  free  will.  That  is,  taken  to  the  extreme  it  implies  that  you  were  predestined  to 
read  this  book  because  it  is  a natural  consequence  of  this  mechanical  universe! 

Newton’s  Laws  of  Gravitation 

Newton’s Laws of Gravitation and the Laws of Electrostatics are essentially identical since they both involve a central inverse square-law dependence of the forces. The important difference is that the gravitational force is attractive whereas the electrostatic force between identical charges is repulsive. That is, the gravitational constant $G$ is replaced by $-\frac{1}{4\pi\epsilon_0}$ and the mass density $\rho$ becomes the charge density for the case of electrostatics. As a consequence it is unnecessary to make a detailed study of Newton’s law of gravitation since it is identical to what has already been studied in your accompanying electrostatic courses. Table 2.1 summarizes and compares the laws of gravitation and electrostatics. For both gravitation and electrostatics the field is central and conservative and depends as $1/r^2$.

The laws of gravitation and electrostatics can be expressed in a more useful form in terms of the flux and circulation of the field as given either in the vector integral or vector differential forms. The radial independence of the flux, and corresponding divergence, is a statement that the fields are radial and have a $1/r^2$ dependence. The statement that the circulation, and corresponding curl, are zero is a statement that the fields are radial and conservative.

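Table 2.1: Comparison of Newton’s law of gravitation and electrostatics.

| | Gravitation | Electrostatics |
|---|---|---|
| Force field | $\mathbf{g} \equiv \mathbf{F}_G/m$ | $\mathbf{E} \equiv \mathbf{F}_E/q$ |
| Density | Mass density $\rho(\mathbf{r}')$ | Charge density $\rho(\mathbf{r}')$ |
| Conservative central field | $\mathbf{g}(\mathbf{r}) = -G\int_V \frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r}-\mathbf{r}')}}{(\mathbf{r}-\mathbf{r}')^2}\,dv'$ | $\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\epsilon_0}\int_V \frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r}-\mathbf{r}')}}{(\mathbf{r}-\mathbf{r}')^2}\,dv'$ |
| Flux | $\Phi \equiv \oint_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_{\text{enclosed volume}} \rho\,dv$ | $\Phi \equiv \oint_S \mathbf{E}\cdot d\mathbf{S} = \frac{1}{\epsilon_0}\int_{\text{enclosed volume}} \rho\,dv$ |
| Circulation | $\oint \mathbf{g}_{net}\cdot d\mathbf{l} = 0$ | $\oint \mathbf{E}_{net}\cdot d\mathbf{l} = 0$ |
| Divergence | $\nabla\cdot\mathbf{g} = -4\pi G\rho$ | $\nabla\cdot\mathbf{E} = \frac{1}{\epsilon_0}\rho$ |
| Curl | $\nabla\times\mathbf{g} = 0$ | $\nabla\times\mathbf{E} = 0$ |
| Potential | $\Delta\phi_{\infty\to p} = -G\int_v \frac{\rho(p')\,dv'}{r_{p'p}}$ | $\Delta\phi_{\infty\to p} = \frac{1}{4\pi\epsilon_0}\int_v \frac{\rho(p')\,dv'}{r_{p'p}}$ |
| Poisson’s equation | $\nabla^2\phi = 4\pi G\rho$ | $\nabla^2\phi = -\frac{1}{\epsilon_0}\rho$ |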

Both the gravitational and electrostatic central fields are conservative, making it possible to use the concept of the scalar potential field $\phi$. This concept is especially useful for solving some problems since the potential can be evaluated using a scalar integral. An alternate approach is to solve Poisson’s equation if the boundary values and mass distributions are known. The methods of solution of Newton’s law of gravitation are identical to those used in electrostatics and are readily accessible in the literature.
Workshop exercises

1.  Spend  a few  minutes  looking  over  the  following  problems,  paying  particular  attention  to  the  problems  that 
you  think  you  might  have  trouble  with.  All  of  the  problems  are  taken  from  an  introductory  physics  course  on 
mechanics,  so  this  should  seem  like  review  material.  After  you  have  had  some  time  to  look  over  the  problems, 
you  will  take  turns  stepping  up  to  the  board  to  solve  one.  When  it  is  your  turn,  you  may  pick  ANY  of  the 
problems  that  have  not  already  been  solved.  Depending  on  the  number  of  students  in  the  recitation,  you  may 
be  asked  to  solve  more  than  one  problem.  Good  luck! 

(a)  Justin  fires  a 12-gram  bullet  into  a block  of  wood.  The  bullet  travels  at  190  m/s,  penetrates  the  2.0-kg 
block  of  wood,  and  emerges  going  150  m/s.  If  the  block  is  stationary  on  a frictionless  surface  when  hit, 
how  fast  does  it  move  after  the  bullet  emerges? 

(b) A mass $m$ at the end of a spring vibrates with a frequency of 0.88 Hz; when an additional 1.25 kg mass is added to $m$, the frequency is 0.48 Hz. What is the value of $m$?

(c) Dan has a new chandelier in his living room. The chandelier is 27 kg and it hangs from the ceiling on a vertical 4.0-m-long wire. What horizontal force would Dan need to use to displace its position 0.10 m to one side? What will be the tension in the wire?

(d) Dianne has a new spring with a spring constant of 900 N/m that she bought at Springs-R-Us. She places it vertically on a table and compresses it by 0.150 m. What upward speed can it give to a 0.300-kg ball when released?

(e) A tiger leaps horizontally from a 6.5-m-high rock with a speed of 4.0 m/s. How far from the base of the rock will she land?

(f)  How  much  work  must  SuperRyan  do  to  stop  a 1300-kg  car  traveling  at  100  km/hr? 

(g)  Jason  catches  a baseball  3.1  s after  throwing  it  vertically  upward.  With  what  speed  did  he  throw  it  and 
what  height  did  it  reach? 

(h)  Laura  is  practicing  her  figure  skating  and  during  her  finale  she  can  increase  her  rotation  rate  from  an 
initial rate of 1.0 rev every 2.0 s to a final rate of 3.0 rev/s. If her initial moment of inertia was 4.6 kg·m²,
what  is  her  final  moment  of  inertia? 

(i)  On  an  icy  day  in  Rochester  (imagine  that!),  you  worry  about  parking  your  car  in  your  driveway,  which 
has  an  incline  of  12°.  Your  neighbor  Emily’s  driveway  has  an  incline  of  9°,  and  Brian’s  driveway  across 
the  street  has  one  of  6°.  The  coefficient  of  static  friction  between  tire  rubber  and  ice  is  0.15.  Which 
driveway(s)  will  be  safe  to  park  a car? 

2. Two particles are projected from the same point with velocities $v_1$ and $v_2$, at elevations $\alpha_1$ and $\alpha_2$, respectively ($\alpha_1 > \alpha_2$). Show that if they are to collide in mid-air the interval between the firings must be

$$\frac{2 v_1 v_2 \sin(\alpha_1 - \alpha_2)}{g\,(v_1\cos\alpha_1 + v_2\cos\alpha_2)}$$

(If  you  don’t  have  time  to  solve  this  problem  completely,  then  at  least  give  an  outline  of  how  you  would  go 
about  solving  the  problem.) 

3.  Read  each  of  the  following  statements  and,  without  consulting  anyone  else,  mark  them  true  or  false.  If  you  are 
unsure  of  any  of  them,  make  a guess.  Once  everyone  has  answered  each  of  the  statements  individually,  break 
into  small  groups  and  compare  your  answers.  Try  to  come  to  an  agreement  as  a group.  The  Teaching  Assistant 
will  then  make  sure  everyone  has  the  correct  answer.  Good  luck! 

(a)  The  conservation  of  linear  momentum  is  a consequence  of  translational  symmetry,  or  the  homogeneity  of 
space. 

(b)  For  an  isolated  system  with  no  external  forces  acting  on  it,  the  angular  momentum  will  remain  constant 
in  both  magnitude  and  direction. 

(c)  A reference  frame  is  called  an  inertial  frame  if  Newton’s  laws  are  valid  in  that  frame. 

(d)  Newtonian  mechanics  and  the  laws  of  electromagnetism  are  invariant  under  Galilean  transformations. 
(e) The law of conservation of angular momentum is a consequence of rotational symmetry, or the isotropy of space.

(f)  The  center  of  mass  of  a system  of  particles  moves  like  a single  particle  of  mass  M (total  mass  of  the 
system)  acted  on  by  a single  force  F that  is  equal  to  the  sum  of  all  the  external  forces  acting  on  the 
system. 

(g)  If  Newton’s  laws  are  valid  in  one  reference  frame,  then  they  are  also  valid  in  any  reference  frame  accelerated 
with  respect  to  the  first  system. 

(h)  The  law  of  conservation  of  energy  is  a consequence  of  inversion  symmetry,  or  the  invertibility  of  space. 

4. The teeter toy comprises two identical weights which hang on drooping arms attached to a peg as shown.

The  arrangement  is  unexpectedly  stable  and  can  be  spun  and  rocked  with  little  danger  of  toppling  over. 


(a) Find an expression for the potential energy of the teeter toy as a function of $\theta$ when the teeter toy is cocked at an angle $\theta$ about the pivot point. For simplicity, consider only rocking motion in the vertical plane.

(b) Determine the equilibrium value(s) of $\theta$.

(c) Determine whether the equilibrium is stable, unstable, or neutral for the value(s) of $\theta$ found in part (b).

(d) How could you determine the answers to parts (b) and (c) from a graph of the potential energy versus $\theta$?

(e) Expand the expression for the potential energy about $\theta = 0$ and determine the frequency of small oscillations.

5.  For  each  of  the  situations  described  below,  determine  which  of  the  four  functional  forms  of  the  force  is  most 
appropriate.  Consider  motion  only  along  one  dimension. 

• Constant force: $F = \text{constant}$

• Time-dependent force: $F = F(t)$

• Velocity-dependent force: $F = F(v)$

• Distance-dependent force: $F = F(x)$

Go  around  the  room  and  take  turns  answering  a question.  When  it  is  your  turn,  pick  a functional  form  and 
explain  why  you  chose  the  one  you  did.  If  you  are  unsure,  make  a guess  or  ask  a question  to  get  help  from  the 
rest  of  the  workshop.  There  may  be  more  than  one  answer  depending  on  your  interpretation  of  the  situation, 
so  be  sure  to  explore  all  of  the  possibilities. 

(a)  A mass  resting  on  a frictionless  table  is  attached  to  a spring,  which  in  turn  is  attached  to  a wall.  The 
mass  is  pulled  to  the  side  and  executes  simple  harmonic  motion  in  the  horizontal  direction. 

(b)  A freely-falling  body  subject  to  a constant  gravitational  field  with  no  air  resistance. 

(c) An electron, initially at rest (treat it classically!), encounters an incoming electromagnetic wave of electric field intensity $E$ given by $E = E_0 \sin(\omega t + \phi)$.

(d)  A large  mass  is  affected  by  the  gravitational  field  of  another  mass  a distance  d away. 
(e) A freely-falling body subject to a constant gravitational field with air resistance.

(f)  A charged  point  particle  is  affected  by  the  presence  of  another  charged  point  particle  a distance  d away. 

6. A particle of mass $m$ is constrained to move on the frictionless inner surface of a cone of half-angle $\alpha$, as shown in the figure.

(a) Find the restrictions on the initial conditions such that the particle moves in a circular orbit about the vertical axis.

(b) Determine whether this kind of orbit is stable.

7.  Consider  a thin  rod  of  length  L and  mass  M. 

(a)  Draw  gravitational  field  lines  and  equipotential  lines  for  the  rod.  What  can  you  say  about  the  equipotential 
surfaces  of  the  rod? 

(b)  Calculate  the  gravitational  potential  at  a point  P that  is  a distance  r from  one  end  of  the  rod  and  in  a 
direction  perpendicular  to  the  rod. 

(c)  Calculate  the  gravitational  field  at  P by  direct  integration. 

(d)  Could  you  have  used  Gauss’s  law  to  find  the  gravitational  field  at  P?  Why  or  why  not? 

8. Consider a single particle of mass $m$.

(a) Determine the position $\mathbf{r}$ and velocity $\mathbf{v}$ of a particle in spherical coordinates.

(b) Determine the total mechanical energy of the particle in potential $V$.

(c) Assume the force is conservative. Show that $\mathbf{F} = -\nabla V$. Show that it agrees with Stokes' theorem.

(d) Show that the angular momentum $\mathbf{L} = \mathbf{r} \times \mathbf{p}$ of the particle is conserved. Hint: $\frac{d}{dt}(\mathbf{A} \times \mathbf{B}) = \mathbf{A} \times \frac{d\mathbf{B}}{dt} + \frac{d\mathbf{A}}{dt} \times \mathbf{B}$.

9. Consider a fluid with density $\rho$ and velocity $\mathbf{v}$ in some volume $V$. The mass current $\mathbf{J} = \rho\mathbf{v}$ determines the amount of mass exiting the surface per unit time by the integral $\oint_S \mathbf{J} \cdot d\mathbf{A}$.

(a) Using the divergence theorem, prove the continuity equation, $\nabla \cdot \mathbf{J} + \frac{\partial \rho}{\partial t} = 0$

10. A rocket of initial mass $M$ burns fuel at constant rate $k$ (kilograms per second), producing a constant force $f$. The total mass of available fuel is $m_0$. Assume the rocket starts from rest and moves in a fixed direction with no external forces acting on it.

(a)  Determine  the  equation  of  motion  of  the  rocket. 

(b)  Determine  the  final  velocity  of  the  rocket. 

(c)  Determine  the  displacement  of  the  rocket  in  time. 
Problems

1.  Consider  a solid  hemisphere  of  radius  a.  Compute  the  coordinates  of  the  center  of  mass  relative  to  the  center 
of  the  spherical  surface  used  to  define  the  hemisphere. 

2. A 2000 kg Ford Excursion was travelling south on Mt. Hope Avenue when it collided with your 1000 kg sports car travelling west on Elmwood Avenue. The two badly-damaged cars became entangled in the collision and left a skid mark that is 20 meters long in a direction 14° to the west of the original direction of travel of the Excursion. The wealthy Excursion driver hires a high-powered lawyer who accuses you of speeding through the intersection. Use your P235 knowledge, plus the police officer’s report of the recoil direction, the skid length, and knowledge that the coefficient of sliding friction between the tires and road is $\mu = 0.6$, to deduce the original velocities of both cars. Was either of the cars exceeding the 30 mph speed limit?

3. A particle of mass $m$ moving in one dimension has potential energy $U(x) = U_0\left[2\left(\frac{x}{a}\right)^2 - \left(\frac{x}{a}\right)^4\right]$, where $U_0$ and $a$ are positive constants.

a)  Find  the  force  F(x)  that  acts  on  the  particle. 

b)  Sketch  U(x).  Find  the  positions  of  stable  and  unstable  equilibrium. 

c) What is the angular frequency $\omega$ of oscillations about the point of stable equilibrium?

d)  What  is  the  minimum  speed  the  particle  must  have  at  the  origin  to  escape  to  infinity? 

e)  At  t = 0 the  particle  is  at  the  origin  and  its  velocity  is  positive  and  equal  to  the  escape  velocity.  Find  x(t) 
and  sketch  the  result. 

4. a) Consider a single-stage rocket travelling in a straight line subject to an external force $F^{ext}$ acting along the same line, where $v_{ex}$ is the exhaust velocity of the ejected fuel relative to the rocket. Show that the equation of motion is

$$m\dot{v} = -\dot{m}\,v_{ex} + F^{ext}$$

b) Specialize to the case of a rocket taking off vertically from rest in a uniform gravitational field $g$. Assume that the rocket ejects mass at a constant rate of $\dot{m} = -k$ where $k$ is a positive constant. Solve the equation of motion to derive the dependence of velocity on time.

c) The first couple of minutes of the launch of the Space Shuttle can be described roughly by: initial mass $= 2 \times 10^6$ kg, mass after 2 minutes $= 1 \times 10^6$ kg, exhaust speed $v_{ex} = 3000$ m/s, and initial velocity is zero. Estimate the velocity of the Space Shuttle after two minutes of flight.

d) Describe what would happen to a rocket where the thrust $|\dot{m}|\,v_{ex}$ is less than the weight $mg$.

5. A time independent field $\mathbf{F}$ is conservative if $\nabla \times \mathbf{F} = 0$. Use this fact to test if the following fields are conservative, and derive the corresponding potential $U$.

a) $F_x = ayz + bx + c$, $F_y = axz + bz$, $F_z = axy + by$

b) $F_x = -ze^{-x}$, $F_y = \ln z$, $F_z = e^{-x} + \frac{y}{z}$
6. Consider a solid cylinder of mass m and radius r sliding without rolling down the smooth inclined face of a
wedge  of  mass  M that  is  free  to  slide  without  friction  on  a horizontal  plane  floor.  Use  the  coordinates  shown 
in  the  figure. 

a)  How  far  has  the  wedge  moved  by  the  time  the  cylinder  has  descended  from  rest  a vertical  distance  h ? 

b)  Now  suppose  that  the  cylinder  is  free  to  roll  down  the  wedge  without  slipping.  How  far  does  the  wedge 
move  in  this  case  if  the  cylinder  rolls  down  a vertical  distance  h ? 

c)  In  which  case  does  the  cylinder  reach  the  bottom  faster?  How  does  this  depend  on  the  radius  of  the  cylinder? 




7. If the gravitational field vector is independent of the radial distance within a sphere, find the function describing the mass density $\rho(r)$ of the sphere.


Chapter  3 


Linear  oscillators 


3.1  Introduction 

Oscillations  are  a ubiquitous  feature  in  nature.  Examples  are  periodic  motion  of  planets,  the  rise  and  fall 
of  the  tides,  water  waves,  pendulum  in  a clock,  musical  instruments,  sound  waves,  electromagnetic  waves, 
and  wave-particle  duality  in  quantal  physics.  Oscillatory  systems  all  have  the  same  basic  mathematical  form 
although  the  names  of  the  variables  and  parameters  are  different.  The  classical  linear  theory  of  oscillations 
will  be  assumed  in  this  chapter  since:  (1)  The  linear  approximation  is  well  obeyed  when  the  amplitudes  of 
oscillation  are  small,  that  is,  the  restoring  force  obeys  Hooke’s  Law.  (2)  The  Principle  of  Superposition 
applies.  (3)  The  linear  theory  allows  most  problems  to  be  solved  explicitly  in  closed  form.  This  is  in  contrast 
to non-linear systems, where the motion can be complicated and even chaotic, as discussed in chapter 4.


3.2  Linear  restoring  forces 

An  oscillatory  system  requires  that  there  be  a stable  equilibrium  about 
which  the  oscillations  occur.  Consider  a conservative  system  with  potential 
energy  U for  which  the  force  is  given  by 

$$\mathbf{F} = -\nabla U \tag{3.1}$$

Figure  3.1  illustrates  a conservative  system  that  has  three  locations  at 
which  the  restoring  force  is  zero,  that  is,  where  the  gradient  of  the  potential 
is  zero.  Stable  oscillations  occur  only  around  locations  1 and  3 whereas 
the system is unstable at the zero gradient location 2. Point 2 is called a separatrix in that an infinitesimal displacement of the particle from this separatrix will cause the particle to diverge towards either minimum 1 or 3 depending on which side of the separatrix the particle is displaced.

The requirements for stable oscillations about any point $x_0$ are that the potential energy must have the following properties.

Stability requirements

1) The potential has a stable position for which the restoring force is zero, i.e. $\left(\frac{dU}{dx}\right)_{x_0} = 0$

2) The potential $U$ must be positive and an even function of displacement $x - x_0$. That is, $\left(\frac{d^n U}{dx^n}\right)_{x_0} > 0$ where $n$ is even.

The requirement for the restoring force to be linear is that the restoring force for perturbation about a stable equilibrium at $x_0$ is of the form

$$F = -k(x - x_0) = m\ddot{x} \tag{3.2}$$

The  potential  energy  function  for  a linear  oscillator  has  a pure  parabolic  shape  about  the  minimum  location, 
that  is, 

$$U = \frac{1}{2}k(x - x_0)^2 \tag{3.3}$$

where $x_0$ is the location of the minimum.

Figure 3.1: Stability for a one-dimensional potential $U(x)$.

Fortunately, many oscillatory systems involve small amplitude oscillations about a stable minimum. For weakly non-linear systems, where the amplitude of oscillation $\Delta x$ about the minimum is small, it is useful to make a Taylor expansion of the potential energy about the minimum. That is

$$U(\Delta x) = U(x_0) + \Delta x\frac{dU(x_0)}{dx} + \frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2} + \frac{\Delta x^3}{3!}\frac{d^3U(x_0)}{dx^3} + \frac{\Delta x^4}{4!}\frac{d^4U(x_0)}{dx^4} + \cdots \tag{3.4}$$

By definition, at the minimum $\frac{dU(x_0)}{dx} = 0$, and thus equation 3.4 can be written as

$$\Delta U = U(\Delta x) - U(x_0) = \frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2} + \frac{\Delta x^3}{3!}\frac{d^3U(x_0)}{dx^3} + \frac{\Delta x^4}{4!}\frac{d^4U(x_0)}{dx^4} + \cdots \tag{3.5}$$

For small amplitude oscillations, the system is linear if the second-order term $\frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2}$ in equation 3.5 is dominant.

The  linearity  for  small  amplitude  oscillations  greatly  simplifies  description  of  the  oscillatory  motion  and 
complicated  chaotic  motion  is  avoided.  Most  physical  systems  are  approximately  linear  for  small  amplitude 
oscillations,  and  thus  the  motion  close  to  equilibrium  approximates  a linear  harmonic  oscillator. 
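
As a rough numerical illustration of the small-amplitude argument, the sketch below estimates the second derivative of a potential at its minimum by finite differences and compares the resulting parabola with the full potential. The pendulum-like potential $U = mgl(1 - \cos\theta)$ and the values of `m`, `g`, `l` are assumptions chosen only for illustration; the helper names are hypothetical.

```python
import math

def second_derivative(U, x0, h=1e-3):
    """Central finite-difference estimate of d^2U/dx^2 at x0."""
    return (U(x0 + h) - 2.0 * U(x0) + U(x0 - h)) / h**2

# Illustrative parameters (assumed values, not taken from the text).
m, g, l = 1.0, 9.81, 1.0

def U_full(theta):
    """Example non-linear potential with a minimum at theta = 0."""
    return m * g * l * (1.0 - math.cos(theta))

# Effective spring constant: the coefficient of the second-order Taylor term.
k_eff = second_derivative(U_full, 0.0)

def U_parabola(theta):
    """Second-order (linear-oscillator) approximation about the minimum."""
    return 0.5 * k_eff * theta**2

def relative_error(theta):
    """Fractional error of the parabolic approximation at displacement theta."""
    return abs(U_parabola(theta) - U_full(theta)) / U_full(theta)

for theta in (0.05, 0.1, 0.5, 1.0):
    print(f"theta = {theta:4.2f} rad, relative error = {relative_error(theta):.4f}")
```

Under these assumptions the error of the parabolic form grows with amplitude, which is the sense in which the second-order term dominates only for small displacements.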


3.3  Linearity  and  superposition 


An  important  aspect  of  linear  systems  is  that  the  solutions  obey  the  Principle  of  Superposition,  that  is,  for 
the  superposition  of  different  oscillatory  modes,  the  amplitudes  add  linearly.  The  linearly-damped  linear 
oscillator  is  an  example  of  a linear  system  in  that  it  involves  only  linear  operators,  that  is,  it  can  be  written 
in the operator form (appendix F.2)

$$\left(\frac{d^2}{dt^2} + \Gamma\frac{d}{dt} + \omega_0^2\right)x(t) = A\cos\omega t \tag{3.6}$$

The quantity in the brackets on the left hand side is a linear operator that can be designated by $\mathbb{L}$ where

$$\mathbb{L}\,x(t) = F(t) \tag{3.7}$$

An important feature of linear operators is that they obey the principle of superposition. This property results from the fact that linear operators are distributive, that is

$$\mathbb{L}(x_1 + x_2) = \mathbb{L}(x_1) + \mathbb{L}(x_2) \tag{3.8}$$

Therefore if there are two solutions $x_1(t)$ and $x_2(t)$ for two different forcing functions $F_1(t)$ and $F_2(t)$

$$\mathbb{L}\,x_1(t) = F_1(t) \qquad \mathbb{L}\,x_2(t) = F_2(t) \tag{3.9}$$

then the addition of these two solutions, with arbitrary constants, also is a solution for linear operators.

$$\mathbb{L}(a_1 x_1 + a_2 x_2) = a_1 F_1(t) + a_2 F_2(t) \tag{3.10}$$

In general then

$$\mathbb{L}\left(\sum_{n=1}^{N} a_n x_n(t)\right) = \left(\sum_{n=1}^{N} a_n F_n(t)\right) \tag{3.11}$$

The left hand bracket can be identified as the linear combination of solutions

$$x(t) = \sum_{n=1}^{N} a_n x_n(t) \tag{3.12}$$

while the driving force is a linear superposition of harmonic forces

$$F(t) = \sum_{n=1}^{N} a_n F_n(t) \tag{3.13}$$

Thus these linear combinations also satisfy the general linear equation

$$\mathbb{L}\,x(t) = F(t) \tag{3.14}$$


Applicability  of  the  Principle  of  Superposition  to  a system  provides  a tremendous  advantage  for  handling 
and  solving  the  equations  of  motion  of  oscillatory  systems. 
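
The distributive property can be checked numerically. The following is a minimal sketch, not a definitive implementation: it builds the damped-oscillator operator from finite differences, applies it to two arbitrary test functions, and compares the result with a cubic (non-linear) variant for contrast. The values of `GAMMA` and `OMEGA0`, the test functions, and all function names are assumptions made for illustration.

```python
import math

GAMMA, OMEGA0 = 0.5, 2.0   # illustrative damping and natural frequency (assumed)
H = 1e-3                   # finite-difference step

def first_derivative(f, t):
    return (f(t + H) - f(t - H)) / (2.0 * H)

def second_derivative(f, t):
    return (f(t + H) - 2.0 * f(t) + f(t - H)) / H**2

def linear_op(x):
    """Return the function (L x)(t) = x'' + GAMMA x' + OMEGA0^2 x."""
    return lambda t: second_derivative(x, t) + GAMMA * first_derivative(x, t) + OMEGA0**2 * x(t)

def nonlinear_op(x):
    """Same operator with an added cubic term, which breaks linearity."""
    return lambda t: linear_op(x)(t) + x(t)**3

# Two arbitrary smooth test functions and arbitrary constants.
x1 = math.sin
x2 = lambda t: math.exp(-t)
a1, a2 = 3.0, -2.0
combo = lambda t: a1 * x1(t) + a2 * x2(t)

t = 0.7
linear_gap = abs(linear_op(combo)(t) - (a1 * linear_op(x1)(t) + a2 * linear_op(x2)(t)))
nonlinear_gap = abs(nonlinear_op(combo)(t) - (a1 * nonlinear_op(x1)(t) + a2 * nonlinear_op(x2)(t)))
print(linear_gap, nonlinear_gap)
```

For the linear operator the gap is at the level of rounding error, whereas the cubic term leaves a finite mismatch, so superposition fails for the non-linear case.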


3.4  Geometrical  representations  of  dynamical  motion 

The  powerful  pattern-recognition  capabilities  of  the  human  brain,  coupled  with  geometrical  representations 
of  the  motion  of  dynamical  systems,  provide  a sensitive  probe  of  periodic  motion.  The  geometry  of  the 
motion  often  can  provide  more  insight  into  the  dynamics  than  inspection  of  mathematical  functions.  A 
system with $n$ degrees of freedom is characterized by locations $q_i$, velocities $\dot{q}_i$, and momenta $p_i$, where $1 \le i \le n$, in addition to the time $t$ and instantaneous energy $H(t)$. There are many possible combinations of correlations between these 2n + 2 variables. The following three are used frequently.

3.4.1 Configuration space $(q_i, q_j, t)$

A configuration space plot shows the correlated motion of two spatial coordinates $q_i$ and $q_j$ averaged over time. An example is the two-dimensional linear oscillator with two equations of motion and solutions

$$m\ddot{x} + k_x x = 0 \qquad m\ddot{y} + k_y y = 0 \tag{3.15}$$

$$x(t) = A\cos(\omega_x t) \qquad y(t) = B\cos(\omega_y t - \delta) \tag{3.16}$$

where $\omega = \sqrt{\frac{k}{m}}$. For unequal restoring force constants, $k_x \neq k_y$, the trajectory executes complicated Lissajous figures that depend on the angular frequencies $\omega_x$, $\omega_y$, and the phase factor $\delta$. When the ratio of the angular frequencies along the two axes is rational, that is $\frac{\omega_x}{\omega_y}$ is a rational fraction, then the curve will repeat at regular intervals as shown in figure 3.2, and this shape depends on the phase difference. Otherwise the trajectory gradually fills the whole rectangle.


Figure 3.2: Configuration plots of $(x, y)$ where $x = \cos(4t)$ and $y = \cos(5t - \delta)$ at four different phase values $\delta$. The curves are called Lissajous figures.
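
The repetition of the curve for a rational frequency ratio can be sketched as follows, using the 4:5 ratio of the figure. The function names and the restriction to integer angular frequencies are assumptions made to keep the example short.

```python
import math

def lissajous_point(t, omega_x=4, omega_y=5, delta=0.0, A=1.0, B=1.0):
    """Configuration-space point (x, y) of the two-dimensional linear oscillator."""
    return A * math.cos(omega_x * t), B * math.cos(omega_y * t - delta)

def repeat_period(omega_x, omega_y):
    """Time after which the curve retraces itself, for integer angular frequencies."""
    return 2.0 * math.pi / math.gcd(omega_x, omega_y)

T = repeat_period(4, 5)

# Sample one full traversal of the figure for a given phase (e.g. for plotting).
curve = [lissajous_point(i * T / 400, delta=math.pi / 4) for i in range(401)]

# The point at t and at t + T coincide, so the trajectory is closed.
p_start = lissajous_point(0.3, delta=math.pi / 4)
p_later = lissajous_point(0.3 + T, delta=math.pi / 4)
print(p_start, p_later)
```

For an irrational ratio no such common period exists, which is why the trajectory then gradually fills the rectangle.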
3.4.2 State space, $(q_i, \dot{q}_i, t)$

Visualization of a trajectory is enhanced by correlation of configuration $q_i$ and its corresponding velocity $\dot{q}_i$ which specifies the direction of the motion. The state space representation¹ is especially valuable when discussing Lagrangian mechanics which is based on the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$.

The  free  undamped  harmonic  oscillator  provides  a simple  application  of  state  space.  Consider  a mass  m 
attached  to  a spring  with  linear  spring  constant  k for  which  the  equation  of  motion  is 

$$-kx = m\ddot{x} = m\dot{x}\frac{d\dot{x}}{dx} \tag{3.17}$$

By integration this gives

$$\frac{1}{2}m\dot{x}^2 + \frac{1}{2}kx^2 = E \tag{3.18}$$

The  first  term  in  equation  3.18  is  the  kinetic  energy,  the  second  term  is  the  potential  energy,  and  E is  the 
total  energy  which  is  conserved  for  this  system.  This  equation  can  be  expressed  in  terms  of  the  state  space 
coordinates  as 

$$\frac{\dot{x}^2}{\left(\frac{2E}{m}\right)} + \frac{x^2}{\left(\frac{2E}{k}\right)} = 1 \tag{3.19}$$

This corresponds to the equation of an ellipse for a state-space plot of $\dot{x}$ versus $x$ as shown in figure 3.3 upper. The elliptical paths shown correspond to contours of constant total energy which is partitioned between kinetic and potential energy. For the coordinate axis shown, the motion of a representative point will be in a clockwise direction as the total oscillator energy is redistributed between potential and kinetic energy. The area of the ellipse is proportional to the total energy $E$.
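
The ellipse of equation 3.19 can be checked with a short sketch. It assumes the standard undamped solution $x = A\cos\omega_0 t$ with $\omega_0 = \sqrt{k/m}$ and illustrative values for `m`, `k`, and `A`; none of these numbers come from the text.

```python
import math

m, k, A = 2.0, 8.0, 0.5            # illustrative mass, spring constant, amplitude (assumed)
omega0 = math.sqrt(k / m)
E = 0.5 * k * A**2                 # total energy, all potential at maximum displacement

def state(t):
    """State-space point (x, x_dot) for the undamped oscillator started at x = A."""
    return A * math.cos(omega0 * t), -A * omega0 * math.sin(omega0 * t)

def ellipse_lhs(x, x_dot):
    """Left-hand side of equation 3.19; equals 1 on the constant-energy contour."""
    return x_dot**2 / (2.0 * E / m) + x**2 / (2.0 * E / k)

# Area of the ellipse from its semi-axes sqrt(2E/m) and sqrt(2E/k).
area = math.pi * math.sqrt(2.0 * E / m) * math.sqrt(2.0 * E / k)

for t in (0.0, 0.4, 1.3):
    x, x_dot = state(t)
    print(f"t = {t:3.1f}: x = {x:+.3f}, x_dot = {x_dot:+.3f}, lhs = {ellipse_lhs(x, x_dot):.6f}")
```

Starting from $(A, 0)$ the velocity immediately becomes negative, so with $x$ horizontal and $\dot{x}$ vertical the point moves clockwise, and the area $2\pi E/\sqrt{mk}$ scales linearly with $E$.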


3.4.3 Phase space, $(q_i, p_i, t)$

Phase space, which was introduced by J.W. Gibbs for the field of statistical mechanics, provides a fundamental graphical representation in classical mechanics. The phase space coordinates $q_i, p_i$ are the conjugate coordinates $(\mathbf{q}, \mathbf{p})$ and are fundamental to Hamiltonian mechanics which is based on the Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$. For a conservative system, only one phase-space curve passes through any point in phase space like the flow of an incompressible fluid. This makes phase space more useful than state space where many curves pass through any location. Lanczos [La49] defined an extended phase space using four-dimensional relativistic space-time as discussed in chapter 16.

Since $p_x = m\dot{x}$ for the non-relativistic, one-dimensional, linear oscillator, then equation 3.19 can be rewritten in the form

$$\frac{p_x^2}{2mE} + \frac{x^2}{\left(\frac{2E}{k}\right)} = 1 \tag{3.20}$$

This is the equation of an ellipse in the phase space diagram shown in figure 3.3 lower, which looks identical to figure 3.3 upper since the ordinate variable is only multiplied by the constant $m$. That is, the only difference is that the phase-space coordinates $(x, p_x)$ replace the state-space coordinates $(x, \dot{x})$. State space plots are used extensively in this chapter to describe oscillatory motion. Although phase space is more fundamental, both state space and phase space plots provide useful representations for characterizing and elucidating a wide variety of motion in classical mechanics. The following discussion of the undamped simple pendulum illustrates the general features of state space.


Figure 3.3: State space (upper), and phase space (lower) diagrams, for the linear harmonic oscillator.


¹ A universal name for the $(\mathbf{q}, \dot{\mathbf{q}})$ representation has not been adopted in the literature. Therefore this book has adopted
the  name  "state  space"  in  common  with  reference  [Ta05].  Lanczos  [La49]  uses  the  term  "state  space"  to  refer  to  the  extended 
phase  space  (q,  p,t)  discussed  in  chapter  16. 
3.4.4 Plane pendulum

Consider a simple plane pendulum of mass $m$ attached to a string of length $l$ in a uniform gravitational field $g$. There is only one generalized coordinate, $\theta$. Since the moment of inertia of the simple plane-pendulum is $I = ml^2$ then the kinetic energy is

$$T = \frac{1}{2}ml^2\dot{\theta}^2 \tag{3.21}$$

and the potential energy relative to the bottom dead center is

$$U = mgl(1 - \cos\theta) \tag{3.22}$$

Thus the total energy equals

$$E = \frac{1}{2}ml^2\dot{\theta}^2 + mgl(1 - \cos\theta) = \frac{p_\theta^2}{2ml^2} + mgl(1 - \cos\theta) \tag{3.23}$$

where $E$ is a constant of motion. Note that the angular momentum $p_\theta$ is not a constant of motion since the angular acceleration $\dot{p}_\theta$ explicitly depends on $\theta$.

It is interesting to look at the solutions for the equation of motion for a plane pendulum on a $(\theta, \dot{\theta})$ state space diagram shown in figure 3.4. The curves shown are equally-spaced contours of constant total energy. Note that the trajectories are ellipses only at very small angles where $1 - \cos\theta \approx \frac{\theta^2}{2}$; the contours are non-elliptical for higher amplitude oscillations. When the energy is in the range $0 < E < 2mgl$ the motion corresponds to oscillations of the pendulum about $\theta = 0$. The center of the ellipse is at $(0, 0)$ which is a stable equilibrium point for the oscillation. However, when $|E| > 2mgl$ there is a phase change to rotational motion about the horizontal axis, that is, the pendulum swings around and over top dead center, i.e. it rotates continuously in one direction about the horizontal axis. The phase change occurs at $E = 2mgl$ and is designated by the separatrix trajectory.
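
The division of the state-space diagram into oscillation, separatrix, and rotation can be expressed as a small sketch based on the total energy of equation 3.23. The default values of `m`, `l`, `g` and the function names are assumptions for illustration only.

```python
import math

def pendulum_energy(theta, theta_dot, m=1.0, l=1.0, g=9.81):
    """Total energy: kinetic term plus potential relative to bottom dead center."""
    return 0.5 * m * l**2 * theta_dot**2 + m * g * l * (1.0 - math.cos(theta))

def classify_motion(E, m=1.0, l=1.0, g=9.81):
    """Classify the state-space contour of energy E relative to the separatrix."""
    E_separatrix = 2.0 * m * g * l
    if math.isclose(E, E_separatrix, rel_tol=1e-12):
        return "separatrix"
    return "oscillation" if E < E_separatrix else "rotation"

def theta_dot_on_contour(theta, E, m=1.0, l=1.0, g=9.81):
    """Positive-branch angular velocity on the contour of energy E, or None if unreachable."""
    arg = 2.0 * (E - m * g * l * (1.0 - math.cos(theta))) / (m * l**2)
    return math.sqrt(arg) if arg >= 0.0 else None

print(classify_motion(pendulum_energy(0.5, 0.0)))        # released from rest at 0.5 rad
print(classify_motion(pendulum_energy(0.0, 7.0)))        # fast push at the bottom
print(classify_motion(pendulum_energy(math.pi, 0.0)))    # at rest at top dead center
```

Tracing `theta_dot_on_contour` over a range of angles for several energies, together with its negative branch, would reproduce contours of the kind described for the state-space diagram.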

Figure 3.4 shows two cycles for $\theta$ to better illustrate the cyclic nature of the phase diagram. The closed loops, shown as fine solid lines, correspond to pendulum oscillations about $\theta = 0$ or $2\pi$ for $E < 2mgl$. The dashed lines show rolling motion for cases where the total energy $E > 2mgl$. The broad solid line is the separatrix that separates the rolling and oscillatory motion. Note that at the separatrix the kinetic energy and $\dot{\theta}$ are zero when the pendulum is at top dead center which occurs when $\theta = \pm\pi$. The point $(\pi, 0)$ is an unstable equilibrium characterized by phase lines that are hyperbolic near this unstable equilibrium point. Note that $\theta = +\pi$ and $-\pi$ correspond to the same physical point, that is, the phase diagram is better presented on a cylindrical phase space representation since $\theta$ is a cyclic variable that cycles around the cylinder whereas $\dot{\theta}$ oscillates equally about zero having both positive and negative values. If the state-space diagram is wrapped around a cylinder, then the unstable and stable equilibrium points will be at diametrically opposite locations on the surface of the cylinder at $\dot{\theta} = 0$. For small oscillations about equilibrium, also called librations, the correlation between $\theta$ and $\dot{\theta}$ is given by the clockwise closed loops wrapped on the cylindrical surface, whereas for energies $|E| > 2mgl$ the positive $\dot{\theta}$ corresponds to counterclockwise rotations while the negative $\dot{\theta}$ corresponds to clockwise rotations.

State-space  diagrams  will  be  used  for  describing  oscillatory  motion  in  chapters  3 and  4.  Phase  space  is 
used in statistical mechanics in order to handle the equations of motion for ensembles of $\sim 10^{23}$ independent
particles  since  momentum  is  more  fundamental  than  velocity.  Rather  than  try  to  account  separately  for 
the  motion  of  each  particle  for  an  ensemble,  it  is  best  to  specify  the  region  of  phase  space  containing  the 
ensemble.  If  the  number  of  particles  is  conserved,  then  every  point  in  the  initial  phase  space  must  transform 
to  corresponding  points  in  the  final  phase  space.  This  will  be  discussed  in  chapters  8.5  and  14.3. 


Figure 3.4: State space diagram for a plane pendulum. The $\theta$ axis is in units of $\pi$ radians. Note that $\theta = +\pi$ and $-\pi$ correspond to the same physical point, that is, the phase diagram should be rolled into a cylinder connected at $\theta = \pm\pi$.
3.5 Linearly-damped free linear oscillator

3.5.1  General  solution 

All  simple  harmonic  oscillations  are  damped  to  some  degree  due  to  energy  dissipation  via  friction,  viscous 
forces,  or  electrical  resistance  etc.  The  motion  of  damped  systems  is  not  conservative  in  that  energy  is 
dissipated  as  heat.  As  was  discussed  in  chapter  2 the  damping  force  can  be  expressed  as 

$$\mathbf{F}_D(v) = -f(v)\,\hat{\mathbf{v}} \tag{3.24}$$

where  the  velocity  dependent  function  f(v)  can  be  complicated.  Fortunately  there  is  a very  large  class  of 
problems  in  electricity  and  magnetism,  classical  mechanics,  molecular,  atomic,  and  nuclear  physics,  where 
the  damping  force  depends  linearly  on  velocity  which  greatly  simplifies  solution  of  the  equations  of  motion. 
Therefore  this  chapter  will  discuss  only  linear  damping. 


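Consider the free simple harmonic oscillator, that is, assuming no oscillatory forcing function, with a linear damping term $F_D(v) = -bv$ where the parameter $b$ is the damping factor. Then the equation of motion is

$$-kx - b\dot{x} = m\ddot{x} \tag{3.25}$$

This can be rewritten as

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0 \tag{3.26}$$

where the damping parameter

$$\Gamma = \frac{b}{m} \tag{3.27}$$

and the characteristic angular frequency

$$\omega_0 = \sqrt{\frac{k}{m}} \tag{3.28}$$

The general solution to the linearly-damped free oscillator is obtained by inserting the complex trial solution $z = z_0 e^{i\omega t}$. Then

$$(i\omega)^2 z_0 e^{i\omega t} + i\omega\Gamma z_0 e^{i\omega t} + \omega_0^2 z_0 e^{i\omega t} = 0 \tag{3.29}$$

This implies that

$$\omega^2 - i\Gamma\omega - \omega_0^2 = 0 \tag{3.30}$$

The solution is

$$\omega_\pm = i\frac{\Gamma}{2} \pm \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \tag{3.31}$$

so that the general solution of the damped free oscillator is the superposition

$$z = z_1 e^{i\omega_+ t} + z_2 e^{i\omega_- t} \tag{3.32}$$

This can be written as

$$z = e^{-\left(\frac{\Gamma}{2}\right)t}\left[z_1 e^{i\omega_1 t} + z_2 e^{-i\omega_1 t}\right] \tag{3.33}$$

where

$$\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \tag{3.34}$$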
Underdamped motion $\omega_1^2 = \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 > 0$

When $\omega_1^2 > 0$, then the square root is real so the solution can be written taking the real part of $z$ which gives that equation 3.33 equals

$$x(t) = Ae^{-\left(\frac{\Gamma}{2}\right)t}\cos(\omega_1 t - \beta) \tag{3.35}$$

where $A$ and $\beta$ are adjustable constants fit to the initial conditions. Therefore the velocity is given by

$$\dot{x}(t) = -Ae^{-\left(\frac{\Gamma}{2}\right)t}\left[\omega_1\sin(\omega_1 t - \beta) + \frac{\Gamma}{2}\cos(\omega_1 t - \beta)\right] \tag{3.36}$$

This is the damped sinusoidal oscillation illustrated in figure 3.5 upper. The solution has the following characteristics:

a) The oscillation amplitude decreases exponentially with a time constant $\tau_D = \frac{2}{\Gamma}$.

b) There is a small reduction in the frequency of the oscillation due to the damping leading to $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$.
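
As a consistency check on the underdamped solution, the sketch below evaluates equations 3.35 and 3.36 and confirms numerically that they satisfy $\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0$. The parameter values and function names are illustrative assumptions, and the acceleration is obtained from a finite difference of the analytic velocity rather than from a closed-form expression.

```python
import math

def underdamped_state(t, A=1.0, beta=0.0, gamma=0.4, omega0=2.0):
    """Return (x, x_dot) for the underdamped oscillator; requires gamma/2 < omega0."""
    omega1_sq = omega0**2 - (gamma / 2.0)**2
    if omega1_sq <= 0.0:
        raise ValueError("not underdamped: need gamma/2 < omega0")
    omega1 = math.sqrt(omega1_sq)
    envelope = A * math.exp(-0.5 * gamma * t)
    phase = omega1 * t - beta
    x = envelope * math.cos(phase)
    x_dot = -envelope * (omega1 * math.sin(phase) + 0.5 * gamma * math.cos(phase))
    return x, x_dot

def ode_residual(t, gamma=0.4, omega0=2.0, h=1e-4):
    """x_ddot + gamma*x_dot + omega0^2*x, with x_ddot from a central difference of x_dot."""
    x, x_dot = underdamped_state(t, gamma=gamma, omega0=omega0)
    v_plus = underdamped_state(t + h, gamma=gamma, omega0=omega0)[1]
    v_minus = underdamped_state(t - h, gamma=gamma, omega0=omega0)[1]
    x_ddot = (v_plus - v_minus) / (2.0 * h)
    return x_ddot + gamma * x_dot + omega0**2 * x

tau_amplitude = 2.0 / 0.4   # amplitude time constant 2/gamma for the default gamma
for t in (0.0, 0.5, 3.0):
    print(f"t = {t:3.1f}: residual = {ode_residual(t):.2e}")
```

The residual stays at the level of the finite-difference error, and after one amplitude time constant the displacement is bounded by $A/e$.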


Figure 3.5: The amplitude-time dependence and state-space diagrams for the free linearly-damped harmonic oscillator. The upper row shows the underdamped system for a particular value of the damping $\Gamma$. The lower row shows the overdamped ($\frac{\Gamma}{2} > \omega_0$) [solid line] and critically damped ($\frac{\Gamma}{2} = \omega_0$) [dashed line] cases, in both cases assuming that initially the system is at rest.
Figure 3.6: Real and imaginary solutions $\omega_\pm$ of the damped harmonic oscillator. A phase transition occurs at $\Gamma = 2\omega_0$. For $\Gamma < 2\omega_0$ (dashed) the two solutions are complex conjugates and imaginary. For $\Gamma > 2\omega_0$ (solid), there are two real solutions $\omega_+$ and $\omega_-$ with widely different decay constants where $\omega_+$ dominates the decay at long times.


Overdamped case $\omega_1^2 = \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 < 0$

In this case the square root of $\omega_1^2$ is imaginary and can be expressed as $\omega_1 = i\sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}$. Therefore the solution is obtained more naturally by using a real trial solution $z = z_0 e^{-\omega t}$ in the equation of motion 3.26, which leads to two roots

$$\omega_\pm = \frac{\Gamma}{2} \mp \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}$$

Thus the exponentially damped decay has two decay rates $\omega_+$ and $\omega_-$.

$$x(t) = \left[A_1 e^{-\omega_+ t} + A_2 e^{-\omega_- t}\right] \tag{3.37}$$

The time constant $\frac{1}{\omega_-} < \frac{1}{\omega_+}$, thus the second term $A_2 e^{-\omega_- t}$ in the bracket decays in a shorter time than the first term $A_1 e^{-\omega_+ t}$. As illustrated in figure 3.6 the decay rate, which is imaginary when underdamped, i.e. $\frac{\Gamma}{2} < \omega_0$, bifurcates into two real values $\omega_\pm$ for overdamped, i.e. $\frac{\Gamma}{2} > \omega_0$. At large times the dominant term when overdamped is for $\omega_+$ which has the smallest decay rate, that is, the longest decay constant $\tau_+ = \frac{1}{\omega_+}$. There is no oscillatory motion for the overdamped case; it slowly moves monotonically to zero as shown in figure 3.5 lower. The amplitude decays away with a time constant that is longer than $\frac{2}{\Gamma}$.

Critically damped $\omega_1^2 = \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 = 0$

This is the limiting case where $\frac{\Gamma}{2} = \omega_0$. For this case the solution is of the form

$$x(t) = (A + Bt)e^{-\left(\frac{\Gamma}{2}\right)t} \tag{3.38}$$

This motion also is non-sinusoidal and evolves monotonically to zero. As shown in figure 3.5 the critically-damped solution goes to zero with the shortest time constant, that is, largest $\omega$. Thus analog electric meters are built almost critically damped so the needle moves to the new equilibrium value in the shortest time without oscillation.

It is useful to graphically represent the motion of the damped linear oscillator on either a state space $(x, \dot{x})$ diagram or phase space $(x, p_x)$ diagram as discussed in chapter 3.4. The state space plots for the underdamped, overdamped, and critically-damped solutions of the damped harmonic oscillator are shown in figure 3.5. For underdamped motion the state space diagram spirals inwards to the origin in contrast to critical or overdamped motion where the state and phase space diagrams move monotonically to zero.
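
The three damping regimes can be distinguished by the sign of $\omega_0^2 - (\Gamma/2)^2$. The sketch below is one hedged way to encode that test; the function names are hypothetical, and the two overdamped decay rates are labelled simply "slow" and "fast" rather than by any particular sign convention.

```python
import math

def damping_regime(gamma, omega0):
    """Classify the free linearly-damped oscillator by the sign of omega0^2 - (gamma/2)^2."""
    discriminant = omega0**2 - (gamma / 2.0)**2
    if math.isclose(discriminant, 0.0, abs_tol=1e-12):
        return "critically damped"
    return "underdamped" if discriminant > 0.0 else "overdamped"

def overdamped_decay_rates(gamma, omega0):
    """Return (slow_rate, fast_rate), the two real decay rates for gamma/2 > omega0."""
    root = math.sqrt((gamma / 2.0)**2 - omega0**2)
    return gamma / 2.0 - root, gamma / 2.0 + root

for gamma in (0.4, 4.0, 5.0):
    print(gamma, damping_regime(gamma, omega0=2.0))

slow, fast = overdamped_decay_rates(5.0, 2.0)
print(slow, fast)
```

The slow rate is always smaller than $\Gamma/2$, which is why the overdamped amplitude takes longer to decay than the critically damped one, and the product of the two rates equals $\omega_0^2$.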
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3.5.2  Energy  dissipation 


The  instantaneous  energy  is  the  sum  of  the  instantaneous  kinetic  and  potential  energies 

„ 1 , 1,  , 

E = —mx  H — kx 
2 2 

where  x,  and  x are  given  by  the  solution  of  the  equation  of  motion. 

Consider  the  total  energy  of  the  underdamped  system 

T-i  1 -2,1  9 9 

E = -mx  + 9rnwoa: 


(3.39) 


(3.40) 


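where $k = m\omega_0^2$. The average total energy is given by substitution for $x$ and $\dot{x}$ and taking the average over one cycle. Since

$$x(t) = A e^{-\frac{\Gamma}{2} t} \cos(\omega_1 t - \beta) \tag{3.41}$$

then the velocity is given by

$$\dot{x}(t) = -A e^{-\frac{\Gamma}{2} t} \left[ \omega_1 \sin(\omega_1 t - \beta) + \frac{\Gamma}{2} \cos(\omega_1 t - \beta) \right] \tag{3.42}$$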


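Inserting equations 3.41 and 3.42 into 3.40 gives a small amplitude oscillation about an exponential decay for the energy $E$. Averaging over one cycle, and using the fact that $\langle \sin\theta \cos\theta \rangle = 0$ and $\langle \sin^2\theta \rangle = \langle \cos^2\theta \rangle = \frac{1}{2}$, gives the time-averaged total energy as

$$\langle E \rangle = e^{-\Gamma t} \left( \frac{1}{4} m A^2 \omega_1^2 + \frac{1}{4} m A^2 \left(\frac{\Gamma}{2}\right)^2 + \frac{1}{4} m A^2 \omega_0^2 \right) \tag{3.43}$$

Since $\omega_1^2 + (\Gamma/2)^2 = \omega_0^2$, the bracket equals $\frac{1}{2} m A^2 \omega_0^2 \equiv E_0$, so this can be written as

$$\langle E \rangle = E_0 e^{-\Gamma t} \tag{3.44}$$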


Note that the energy of the linearly damped free oscillator decays away with a time constant $\tau = \frac{1}{\Gamma}$. That is, the intensity has a time constant that is half the time constant for the decay of the amplitude of the transient response. Note that the average kinetic and potential energies are identical, as implied by the Virial theorem, and both decay away with the same time constant. This relation between the mean life $\tau$ for decay of the damped harmonic oscillator and the damping width term $\Gamma$ occurs frequently in physics.
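The exponential decay of the cycle-averaged energy can be checked numerically. The following is a minimal sketch, assuming the underdamped solution $x(t) = A e^{-\Gamma t/2}\cos(\omega_1 t - \beta)$ and its velocity as given above; the function names, default parameter values, and the midpoint averaging rule are illustrative choices, not part of the text.

```python
import math

def underdamped_energy(t, m=1.0, omega0=10.0, gamma=0.2, A=1.0, beta=0.0):
    """Instantaneous energy E = m*v^2/2 + m*omega0^2*x^2/2 of the underdamped solution."""
    omega1 = math.sqrt(omega0**2 - (gamma / 2.0) ** 2)
    envelope = A * math.exp(-gamma * t / 2.0)
    phase = omega1 * t - beta
    x = envelope * math.cos(phase)
    v = -envelope * (omega1 * math.sin(phase) + (gamma / 2.0) * math.cos(phase))
    return 0.5 * m * v**2 + 0.5 * m * omega0**2 * x**2

def cycle_averaged_energy(t_start, m=1.0, omega0=10.0, gamma=0.2, A=1.0, steps=2000):
    """Average the instantaneous energy over one damped period starting at t_start."""
    omega1 = math.sqrt(omega0**2 - (gamma / 2.0) ** 2)
    dt = (2.0 * math.pi / omega1) / steps
    total = sum(underdamped_energy(t_start + (i + 0.5) * dt, m, omega0, gamma, A)
                for i in range(steps))
    return total / steps
```

With these defaults $\tau = 1/\Gamma = 5$, so the ratio `cycle_averaged_energy(5.0) / cycle_averaged_energy(0.0)` should be close to $e^{-1}$. It is not exactly $e^{-1}$ because the energy carries a small oscillation on top of the exponential decay.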

The damping of an oscillator usually is characterized by a single parameter $Q$ called the Quality Factor, where

$$Q \equiv \frac{\text{Energy stored in the oscillator}}{\text{Energy dissipated per radian}} \tag{3.45}$$

The energy loss per radian is given by

$$\Delta E = \left| \frac{dE}{dt} \right| \frac{1}{\omega_1} = \frac{E\Gamma}{\omega_1} \tag{3.46}$$

where $\omega_1 = \sqrt{\omega_0^2 - (\Gamma/2)^2}$ is the angular frequency of the free damped linear oscillator. Thus the Quality factor $Q$ equals

$$Q = \frac{E}{\Delta E} = \frac{\omega_1}{\Gamma} \tag{3.47}$$


The larger the $Q$ factor, the less damped is the system, and the greater is the number of cycles of the oscillation in the damped wave train. Chapter 3.11.3 shows that the longer the wave train, that is, the higher the $Q$ factor, the narrower is the frequency distribution around the central value. The Mössbauer effect in nuclear physics provides a remarkably long wave train that can be used to make high precision measurements. The high-$Q$ precision of the LIGO laser interferometer was used in the recent successful search for gravitational waves.
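As a small worked illustration of $Q = \omega_1/\Gamma$, the sketch below computes $Q$ and the number of oscillation cycles that fit into one energy lifetime $\tau = 1/\Gamma$, which is $\omega_1\tau/2\pi = Q/2\pi$. The function names are hypothetical.

```python
import math

def quality_factor(omega0, gamma):
    """Q = omega1 / Gamma for an underdamped oscillator."""
    omega1_sq = omega0**2 - (gamma / 2.0) ** 2
    if omega1_sq <= 0:
        raise ValueError("Q as defined here requires underdamped motion")
    return math.sqrt(omega1_sq) / gamma

def cycles_per_energy_lifetime(omega0, gamma):
    """Number of oscillation cycles during one energy lifetime tau = 1/Gamma."""
    return quality_factor(omega0, gamma) / (2.0 * math.pi)
```

For `omega0 = 10` and `gamma = 0.2` this gives $Q \approx 50$, that is, roughly eight cycles before the energy has fallen by a factor of $e$.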


Typical  Q factors 

Earth,  for  earthquake  wave 

250-1400 

Piano  string 

3000 

Crystal  in  digital  watch 

r 103 

Microwave  cavity 

To3 

Excited  atom 

10Y 

Neutron  star 

HP  1 

LIGO  laser 

To13 

Mossbauer  effect  in  nucleus 

To13 

Table 3.1: Typical Q factors in nature.

3.6 Sinusoidally-driven, linearly-damped, linear oscillator

The linearly-damped linear oscillator, driven by a harmonic driving force, is of considerable importance to all branches of science and engineering. The equation of motion can be written as

$$\ddot{x} + \Gamma \dot{x} + \omega_0^2 x = \frac{F(t)}{m} \tag{3.48}$$

where $F(t)$ is the driving force. For mathematical simplicity the driving force is chosen to be a sinusoidal harmonic force. The solution of this second-order differential equation comprises two components, the complementary solution (transient response), and the particular solution (steady-state response).


3.6.1  Transient  response  of  a driven  oscillator 

The transient response of a driven oscillator is given by the complementary solution of the above second-order differential equation

$$\ddot{x} + \Gamma \dot{x} + \omega_0^2 x = 0 \tag{3.49}$$

which is identical to the equation of motion of the free linearly-damped harmonic oscillator. As discussed in section 3.5 the solution of the linearly-damped free oscillator is given by the real part of the complex variable $z$ where

$$z = e^{-\frac{\Gamma}{2} t} \left[ z_1 e^{i\omega_1 t} + z_2 e^{-i\omega_1 t} \right] \tag{3.50}$$

and

$$\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \tag{3.51}$$


Underdamped motion $\omega_1^2 \equiv \omega_0^2 - (\Gamma/2)^2 > 0$: When $\omega_1^2 > 0$, the square root is real, so the transient solution can be written by taking the real part of $z$, which gives

$$x(t)_T = \frac{F_0}{m} e^{-\frac{\Gamma}{2} t} \cos(\omega_1 t) \tag{3.52}$$

The solution has the following characteristics:

a) The amplitude of the transient solution decreases exponentially with a time constant $\tau_D = \frac{2}{\Gamma}$, while the energy decreases with a time constant of $\frac{1}{\Gamma}$.

b) There is a small downward frequency shift in that $\omega_1 = \sqrt{\omega_0^2 - (\Gamma/2)^2} < \omega_0$.


Overdamped case $\omega_1^2 \equiv \omega_0^2 - (\Gamma/2)^2 < 0$: In this case the square root is imaginary, which can be expressed as $\omega_1' = \sqrt{(\Gamma/2)^2 - \omega_0^2}$, which is real, and the solution is just an exponentially damped one

$$x(t)_T = \frac{F_0}{m} e^{-\frac{\Gamma}{2} t} \left[ e^{\omega_1' t} + e^{-\omega_1' t} \right] \tag{3.53}$$

There is no oscillatory motion for the overdamped case; the displacement moves slowly and monotonically to zero. The total energy decays away with two time constants greater than $\frac{1}{\Gamma}$.


Critically damped $\omega_1^2 \equiv \omega_0^2 - (\Gamma/2)^2 = 0$: For this case, as mentioned for the damped free oscillator, the solution is of the form

$$x(t)_T = (A + Bt)\, e^{-\frac{\Gamma}{2} t} \tag{3.54}$$

The critically-damped system decays away the quickest.

3.6.2 Steady state response of a driven oscillator

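The particular solution of the differential equation gives the important steady state response, $x(t)_S$, to the forcing function. Consider that the forcing term is a single frequency sinusoidal oscillation

$$F(t) = F_0 \cos(\omega t) \tag{3.55}$$

Thus the particular solution is the real part of the complex variable $z$ which is a solution of

$$\ddot{z} + \Gamma \dot{z} + \omega_0^2 z = \frac{F_0}{m} e^{i\omega t} \tag{3.56}$$

A trial solution is

$$z = z_0 e^{i\omega t} \tag{3.57}$$

This leads to the relation

$$-\omega^2 z_0 + i\omega\Gamma z_0 + \omega_0^2 z_0 = \frac{F_0}{m} \tag{3.58}$$

Multiplying the numerator and denominator by the factor $(\omega_0^2 - \omega^2) - i\Gamma\omega$ gives

$$z_0 = \frac{F_0/m}{(\omega_0^2 - \omega^2) + i\Gamma\omega} = \frac{F_0/m}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2} \left[ (\omega_0^2 - \omega^2) - i\Gamma\omega \right] \tag{3.59}$$

The steady state solution $x(t)_S$ thus is given by the real part of $z$, that is

$$x(t)_S = \frac{F_0/m}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2} \left[ (\omega_0^2 - \omega^2) \cos\omega t + \Gamma\omega \sin\omega t \right] \tag{3.60}$$

This can be expressed in terms of a phase $\delta$ defined as

$$\tan\delta \equiv \frac{\Gamma\omega}{\omega_0^2 - \omega^2} \tag{3.61}$$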


As shown in figure 3.7 the hypotenuse of the triangle equals $\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}$. Thus

$$\cos\delta = \frac{\omega_0^2 - \omega^2}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}} \tag{3.62}$$

and

$$\sin\delta = \frac{\Gamma\omega}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}} \tag{3.63}$$

The phase $\delta$ represents the phase difference between the driving force and the resultant motion. For a fixed $\omega_0$ the phase $\delta = 0$ when $\omega = 0$, and increases to $\delta = \frac{\pi}{2}$ when $\omega = \omega_0$. For $\omega > \omega_0$ the phase $\delta \to \pi$ as $\omega \to \infty$.

The steady state solution can be re-expressed in terms of the phase shift $\delta$ as

$$x(t)_S = \frac{F_0/m}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}} \left[ \cos\delta \cos\omega t + \sin\delta \sin\omega t \right] = \frac{F_0/m}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}} \cos(\omega t - \delta) \tag{3.64}$$

Figure 3.5: Phase between driving force and resultant motion.

Figure 3.6: Amplitude versus time, and state space plots of the transient solution (dashed) and total solution (solid) for two cases. The upper row shows the case where the driving frequency $\omega = \frac{\omega_1}{5}$ while the lower row shows the same for the case where the driving frequency $\omega = 5\omega_1$.
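The steady-state amplitude and phase can be read directly off the complex amplitude $z_0 = (F_0/m)/[(\omega_0^2 - \omega^2) + i\Gamma\omega]$. The sketch below is illustrative; the function name and default values are assumptions. It returns $|z_0|$ and $\delta = -\arg z_0$, so that $x(t)_S = |z_0| \cos(\omega t - \delta)$.

```python
import cmath
import math

def steady_state(omega, omega0, gamma, F0=1.0, m=1.0):
    """Return (amplitude, delta) of x_S(t) = amplitude * cos(omega*t - delta)."""
    z0 = (F0 / m) / complex(omega0**2 - omega**2, gamma * omega)
    # Re(z0 * e^{i omega t}) = |z0| cos(omega t + arg z0), hence delta = -arg z0
    return abs(z0), -cmath.phase(z0)
```

At $\omega = \omega_0$ the call returns $\delta = \pi/2$ and amplitude $F_0/(m\Gamma\omega_0)$, and for $\omega \gg \omega_0$ the phase approaches $\pi$, in agreement with the limits stated above.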

3.6.3  Complete  solution  of  the  driven  oscillator 

To summarize, the total solution of the sinusoidally forced linearly-damped harmonic oscillator is the sum of the transient and steady-state solutions of the equations of motion.

$$x(t)_{Total} = x(t)_T + x(t)_S \tag{3.65}$$

Thus for the underdamped case, the transient solution is the complementary solution

$$x(t)_T = \frac{F_0}{m} e^{-\frac{\Gamma}{2} t} \cos(\omega_1 t - \beta) \tag{3.66}$$

where $\omega_1 = \sqrt{\omega_0^2 - (\Gamma/2)^2}$. The steady-state solution is given by the particular solution

$$x(t)_S = \frac{F_0/m}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}} \cos(\omega t - \delta) \tag{3.67}$$

Note that the frequency of the transient solution is $\omega_1$, which in general differs from the driving frequency $\omega$. The phase shift $\beta - \delta$ for the transient component is set by the initial conditions. The transient response leads to a more complicated motion immediately after the driving function is switched on. Figure 3.8 illustrates the amplitude time dependence and state space diagram for the transient component, and the total response, when the driving frequency is either $\omega = \frac{\omega_1}{5}$ or $\omega = 5\omega_1$. Note that the modulation of the steady-state response by the transient response is unimportant once the transient response has damped out, leading to a constant elliptical state space trajectory. For cases where the initial conditions are $x = \dot{x} = 0$, the transient solution has a relative phase difference $\beta - \delta = \pi$ radians at $t = 0$ and relative amplitudes such that the transient and steady-state solutions cancel at $t = 0$.
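The interplay of transient and steady-state parts can be sketched for the initial conditions $x = \dot{x} = 0$. In the code below the transient is written in the equivalent form $e^{-\Gamma t/2}[c\cos\omega_1 t + d\sin\omega_1 t]$, with `c` and `d` chosen to cancel the steady-state displacement and velocity at $t = 0$. The function names, this parametrization, and the default values are assumptions made for illustration.

```python
import math

def driven_response(t, omega, omega0=10.0, gamma=1.0, F0=1.0, m=1.0):
    """Return (transient, steady) parts of x(t) for x(0) = x'(0) = 0 (underdamped case)."""
    denom = (omega0**2 - omega**2) ** 2 + (gamma * omega) ** 2
    a_cos = (F0 / m) * (omega0**2 - omega**2) / denom   # coefficient of cos(omega t)
    a_sin = (F0 / m) * gamma * omega / denom            # coefficient of sin(omega t)
    omega1 = math.sqrt(omega0**2 - (gamma / 2.0) ** 2)
    c = -a_cos                                          # enforces x(0) = 0
    d = ((gamma / 2.0) * c - omega * a_sin) / omega1    # enforces x'(0) = 0
    transient = math.exp(-gamma * t / 2.0) * (
        c * math.cos(omega1 * t) + d * math.sin(omega1 * t))
    steady = a_cos * math.cos(omega * t) + a_sin * math.sin(omega * t)
    return transient, steady

def total(t, omega, **params):
    """Total displacement, the sum of transient and steady-state parts."""
    return sum(driven_response(t, omega, **params))
```

Evaluating `driven_response` at late times shows the transient part shrinking as $e^{-\Gamma t/2}$, leaving only the steady-state oscillation at the driving frequency.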

The  characteristic  sounds  of  different  types  of  musical  instruments  depend  very  much  on  the  admixture 
of  transient  solutions  plus  the  number  and  mixture  of  oscillatory  active  modes.  Percussive  instruments,  such 
as  the  piano,  have  a large  transient  component.  The  mixture  of  transient  and  steady-state  solutions  for 
forced oscillations occurs frequently in studies of RLC networks in electrical circuit analysis.

3.6.4 Resonance

The discussion so far has covered the role of the transient and steady-state solutions of the driven damped harmonic oscillator, which occurs frequently in science and engineering. Another important aspect is resonance, which occurs when the driving frequency $\omega$ approaches the natural frequency $\omega_1$ of the damped system. Consider the case where the time is sufficient for the transient solution to have decayed to zero.

Figure 3.9 shows the amplitude and phase for the steady-state response as $\omega$ goes through a resonance as the driving frequency is changed. The steady-state solution of the driven oscillator follows the driving force when $\omega \ll \omega_0$, in that the phase difference is zero and the amplitude is just $\frac{F_0}{m\omega_0^2}$. The response of the system peaks at resonance, while for $\omega \gg \omega_0$ the harmonic system is unable to follow the more rapidly oscillating driving force and thus the induced oscillation is out of phase with the driving force and the amplitude of the oscillation tends to zero.

Note that the resonance frequency for a driven damped oscillator differs from that for the undriven damped oscillator, and differs from that for the undamped oscillator. The natural frequency for an undamped harmonic oscillator is given by

$$\omega_0^2 = \frac{k}{m} \tag{3.68}$$

The transient solution is the same as damped free oscillations of a damped oscillator and has a frequency of the system $\omega_1$ given by

$$\omega_1^2 = \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 \tag{3.69}$$

That is, damping slightly reduces the frequency.

For the driven oscillator the maximum value of the steady-state amplitude response is obtained by taking the maximum of the amplitude of $x(t)_S$, that is, when its derivative with respect to $\omega$ is zero. This occurs at the resonance angular frequency $\omega_R$ where

$$\omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2 \tag{3.70}$$

Figure 3.7: Resonance behavior for the linearly-damped, harmonically driven, linear oscillator.

No resonance occurs if $\omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2 < 0$ since then $\omega_R$ is imaginary and the amplitude decreases monotonically with increasing $\omega$. Note that the above three frequencies are identical if $\Gamma = 0$ but they differ when $\Gamma > 0$ with $\omega_R < \omega_1 < \omega_0$.

For the driven oscillator it is customary to define the quality factor $Q$ as

$$Q \equiv \frac{\omega_R}{\Gamma} \tag{3.71}$$

When $Q \gg 1$ one has a narrow high resonance peak. As the damping increases the quality factor decreases, leading to a wider and lower peak. The resonance disappears when $Q < 1$.
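The ordering $\omega_R < \omega_1 < \omega_0$ and the location of the amplitude maximum can be checked with a simple frequency scan. This is a rough sketch under stated assumptions: the function names, the scan range $0 \le \omega \le 2\omega_0$, and the grid resolution are illustrative choices, and the amplitude used is the steady-state amplitude $(F_0/m)/\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}$.

```python
import math

def characteristic_frequencies(omega0, gamma):
    """Return (omega0, omega1, omega_R); omega_R is None when no resonance peak exists."""
    omega1 = math.sqrt(omega0**2 - (gamma / 2.0) ** 2)   # assumes underdamped motion
    arg = omega0**2 - 2.0 * (gamma / 2.0) ** 2
    omega_r = math.sqrt(arg) if arg > 0 else None
    return omega0, omega1, omega_r

def amplitude(omega, omega0, gamma, F0=1.0, m=1.0):
    """Steady-state amplitude of the driven, linearly-damped oscillator."""
    return (F0 / m) / math.hypot(omega0**2 - omega**2, gamma * omega)

def scanned_peak(omega0, gamma, steps=20000):
    """Driving frequency on a uniform grid over [0, 2*omega0] with the largest amplitude."""
    grid = [2.0 * omega0 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda w: amplitude(w, omega0, gamma))
```

For `omega0 = 10` and `gamma = 4` the scan peaks near $\sqrt{92} \approx 9.59$, while for `gamma = 15` the largest amplitude sits at $\omega = 0$, which is the no-resonance case described above.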


3.6.5  Energy  absorption 

Discussion of energy stored in resonant systems is best described using the steady state solution which is dominant after the transient solution has decayed to zero. Then

$$x(t)_S = \frac{F_0/m}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2} \left[ (\omega_0^2 - \omega^2) \cos\omega t + \Gamma\omega \sin\omega t \right] \tag{3.72}$$

This can be rewritten as

$$x(t)_S = A_{el} \cos\omega t + A_{abs} \sin\omega t \tag{3.73}$$

where the elastic amplitude

$$A_{el} = \frac{F_0/m}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2} \left( \omega_0^2 - \omega^2 \right) \tag{3.74}$$

while the absorptive amplitude

$$A_{abs} = \frac{F_0/m}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}\, \Gamma\omega \tag{3.75}$$

Figure 3.10 shows the behavior of the absorptive and elastic amplitudes as a function of angular frequency $\omega$. The absorptive amplitude is significant only near resonance whereas the elastic amplitude goes to zero at resonance. Note that the full width at half maximum of the absorptive amplitude peak equals $\Gamma$.

The work done by the force $F_0 \cos\omega t$ on the oscillator is

$$W = \int F\, dx = \int F \dot{x}\, dt \tag{3.76}$$

Thus the absorbed power $P(t)$ is given by

$$P(t) = \frac{dW}{dt} = F \dot{x} \tag{3.77}$$

The steady state response gives a velocity

$$\dot{x}(t)_S = -\omega A_{el} \sin\omega t + \omega A_{abs} \cos\omega t \tag{3.78}$$

Thus the steady-state instantaneous power input is

$$P(t) = F_0 \cos\omega t \left[ -\omega A_{el} \sin\omega t + \omega A_{abs} \cos\omega t \right] \tag{3.79}$$

Figure 3.8: Elastic (solid) and absorptive (dashed) amplitudes of the steady-state solution for $\Gamma = 0.10\,\omega_0$.

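The absorptive term steadily absorbs energy while the elastic term oscillates as energy is alternately absorbed or emitted. The time average over one cycle is given by

$$\langle P \rangle = F_0 \left[ -\omega A_{el} \langle \cos\omega t \sin\omega t \rangle + \omega A_{abs} \langle \cos^2 \omega t \rangle \right] \tag{3.80}$$

where $\langle \cos\omega t \sin\omega t \rangle$ and $\langle \cos^2 \omega t \rangle$ are the time averages over one cycle. The time average over one complete cycle for the first term in the bracket is

$$-\omega A_{el} \langle \cos\omega t \sin\omega t \rangle = 0 \tag{3.81}$$

while for the second term

$$\langle \cos^2 \omega t \rangle = \frac{1}{T} \int_{t_0}^{t_0 + T} \cos^2 \omega t \, dt = \frac{1}{2} \tag{3.82}$$

Thus the time average power input is given by only the absorptive term

$$\langle P \rangle = \frac{1}{2} F_0 \omega A_{abs} = \frac{F_0^2}{2m} \frac{\Gamma\omega^2}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2} \tag{3.83}$$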

This shape of the power curve is a classic Lorentzian shape. Note that the maximum of the average kinetic energy occurs at $\omega_{KE} = \omega_0$, which is different from the peak of the amplitude, which occurs at $\omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2$. The potential energy is proportional to the amplitude squared, which peaks at the same angular frequency as the amplitude, that is, $\omega_{PE}^2 = \omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2$. The kinetic and potential energies resonate at different angular frequencies as a result of the fact that the driven damped oscillator is not conservative, because energy is continually exchanged between the oscillator and the driving force system in addition to the energy dissipation due to the damping.

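When $\omega \approx \omega_0 \gg \Gamma$, the power equation simplifies since

$$(\omega_0^2 - \omega^2) = (\omega_0 + \omega)(\omega_0 - \omega) \approx 2\omega_0 (\omega_0 - \omega) \tag{3.84}$$

Therefore

$$\langle P \rangle \simeq \frac{F_0^2}{8m} \frac{\Gamma}{(\omega_0 - \omega)^2 + \left(\frac{\Gamma}{2}\right)^2} \tag{3.85}$$

This is called the Lorentzian or Breit-Wigner shape. The half power points are at a frequency difference from resonance of $\pm\Delta\omega$ where

$$\Delta\omega = |\omega_0 - \omega| = \frac{\Gamma}{2} \tag{3.86}$$

Thus the full width at half maximum of the Lorentzian curve equals $\Gamma$. Note that the Lorentzian has a narrower peak but much wider tail relative to a Gaussian shape. At the peak of the absorbed power, the absorptive amplitude can be written as

$$A_{abs}(\omega = \omega_0) = \frac{F_0}{m} \frac{Q}{\omega_0^2} \tag{3.87}$$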

That  is,  the  peak  amplitude  increases  with  increase  in  Q.  This  explains  the  classic  comedy  scene  where  the 
soprano  shatters  the  crystal  glass  because  the  highest  quality  crystal  glass  has  a high  Q which  leads  to  a 
large  amplitude  oscillation  when  she  sings  on  resonance. 
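The width of the absorbed-power curve can be verified numerically. The sketch below assumes the time-averaged power $\langle P \rangle = \frac{F_0^2}{2m}\frac{\Gamma\omega^2}{(\omega_0^2-\omega^2)^2 + (\Gamma\omega)^2}$ and locates its two half-power points by bisection; the function names, the bracketing interval $[\omega_0, 10\omega_0]$ for the upper point, and the iteration count are illustrative assumptions.

```python
import math

def average_power(omega, omega0, gamma, F0=1.0, m=1.0):
    """Time-averaged power absorbed by the driven, linearly-damped oscillator."""
    return (F0**2 / (2.0 * m)) * gamma * omega**2 / (
        (omega0**2 - omega**2) ** 2 + (gamma * omega) ** 2)

def _bisect(f, lo, hi, iterations=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo_positive = f(lo) > 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def half_power_width(omega0, gamma):
    """Full width at half maximum of average_power as a function of omega."""
    half = 0.5 * average_power(omega0, omega0, gamma)   # the peak lies at omega = omega0
    g = lambda w: average_power(w, omega0, gamma) - half
    lower = _bisect(g, 0.0, omega0)
    upper = _bisect(g, omega0, 10.0 * omega0)           # assumes gamma is well below 10*omega0
    return upper - lower
```

For this expression the half-power condition reduces to $\omega_0^2 - \omega^2 = \pm\Gamma\omega$, so the computed width equals $\Gamma$ to numerical precision, not only in the narrow-resonance limit.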

The mean lifetime $\tau$ of the free linearly-damped harmonic oscillator, that is, the time for the energy of free oscillations to decay to $1/e$, was shown to be related to the damping coefficient $\Gamma$ by

$$\tau = \frac{1}{\Gamma} \tag{3.88}$$

Therefore we have the classical uncertainty principle for the linearly-damped harmonic oscillator: the measured full-width at half maximum of the energy resonance curve for forced oscillation and the mean life for decay of the energy of a free linearly-damped oscillator are related by

$$\tau\Gamma = 1 \tag{3.89}$$

This relation is correct only for a linearly-damped harmonic system. Comparable relations between the lifetime and damping width exist for different forms of damping.

One can demonstrate the above line width and decay time relationship using an acoustically driven electric guitar string. It also occurs for the width of the electromagnetic radiation and the lifetime for decay of atomic or nuclear electromagnetic decay. This classical uncertainty principle is exactly the same as the one encountered in quantum physics due to wave-particle duality. In nuclear physics it is difficult to measure the lifetime of states when $\tau < 10^{-13}$ s. For shorter lifetimes the value of $\Gamma$ can be determined from the shape of the resonance curve, which can be measured directly when the damping is large.

3.1 Example: Harmonically-driven series RLC circuit

The harmonically-driven, resonant, series RLC circuit is encountered frequently in AC circuits. Kirchhoff's Rules applied to the series RLC circuit lead to the differential equation

$$L\ddot{q} + R\dot{q} + \frac{q}{C} = V_0 \sin\omega t$$

where $q$ is charge, $L$ is the inductance, $C$ is the capacitance, $R$ is the resistance, and the applied voltage across the circuit is $V(\omega) = V_0 \sin\omega t$. The linearity of the network allows use of the phasor approach, which assumes that the current $I = I_0 e^{i\omega t}$, the voltage $V = V_0 e^{i(\omega t + \delta)}$, and the impedance is a complex number $Z \equiv \frac{V}{I} = \frac{V_0}{I_0} e^{i\delta}$, where $\delta$ is the phase difference between the voltage and the current. For this circuit the impedance is given by

$$Z = R + i\left( \omega L - \frac{1}{\omega C} \right)$$

Because of the phases involved in this RLC circuit, at resonance the maximum voltage across the resistor occurs at a frequency of $\omega_R = \omega_0$, across the capacitor the maximum voltage occurs at a frequency $\omega_C^2 = \omega_0^2 - \frac{R^2}{2L^2}$, and across the inductor $L$ the maximum voltage occurs at a frequency $\omega_L^2 = \frac{\omega_0^2}{1 - \frac{R^2}{2L^2\omega_0^2}}$, where $\omega_0^2 = \frac{1}{LC}$ is the resonance angular frequency when $R = 0$. Thus these resonance frequencies differ when $R > 0$.
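The three distinct peak frequencies of the series RLC circuit can be checked with a brute-force scan. The sketch below assumes the impedance $Z = R + i(\omega L - 1/\omega C)$ and the standard series-circuit results $\omega_C^2 = \omega_0^2 - R^2/2L^2$ and $\omega_L^2 = \omega_0^2/(1 - R^2/2L^2\omega_0^2)$; the function names, component values, and scan range are illustrative assumptions.

```python
import math

def voltage_amplitudes(omega, R, L, C, V0=1.0):
    """Voltage amplitudes (V_R, V_C, V_L) across each element of a series RLC circuit."""
    reactance = omega * L - 1.0 / (omega * C)
    current = V0 / math.hypot(R, reactance)          # I0 = V0 / |Z|
    return current * R, current / (omega * C), current * omega * L

def predicted_peaks(R, L, C):
    """Closed-form peak angular frequencies for V_R, V_C and V_L."""
    w0_sq = 1.0 / (L * C)
    w_r = math.sqrt(w0_sq)
    w_c = math.sqrt(w0_sq - R**2 / (2.0 * L**2))
    w_l = math.sqrt(w0_sq / (1.0 - R**2 / (2.0 * L**2 * w0_sq)))
    return w_r, w_c, w_l

def scanned_peaks(R, L, C, lo=0.5, hi=2.0, steps=30000):
    """Grid frequencies in [lo, hi] that maximise V_R, V_C and V_L respectively."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return tuple(max(grid, key=lambda w, k=k: voltage_amplitudes(w, R, L, C)[k])
                 for k in range(3))
```

With `R = 0.5`, `L = 1`, `C = 1` (so $\omega_0 = 1$) the scan reproduces the ordering $\omega_C < \omega_R < \omega_L$.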


3.7  Wave  equation 


Wave  motion  is  a ubiquitous  feature  in  nature.  Mechanical  wave  motion  is  manifest  by  transverse  waves 
on  fluid  surfaces,  longitudinal  and  transverse  seismic  waves  travelling  through  the  Earth,  and  vibrations  of 
mechanical  structures  such  as  suspended  cables.  Acoustical  wave  motion  occurs  on  the  stretched  strings  of 
the  violin,  as  well  as  the  cavities  of  wind  instruments.  Electromagnetic  wave  motion  includes  wavelengths 
ranging from $10^5$ m radio waves to $10^{-13}$ m $\gamma$-rays. Matter waves are a prominent feature of quantum physics.
All  these  manifestations  of  waves  exhibit  the  same  general  features  of  wave  motion. 

Wave  motion  occurs  for  deformable  bodies  where  elastic  forces  acting  between  the  nearest-neighbor  atoms 
of  the  body  exert  time-dependent  forces  on  one  another.  Chapter  12  will  introduce  the  collective  modes  of 
motion,  called  the  normal  modes,  of  coupled,  many-body,  linear  oscillators  which  act  as  independent  modes 
of motion. However, it is useful to introduce wave motion at this juncture because the equations of wave
motion  are  simple,  and  wave  motion  features  prominently  in  several  chapters  of  this  book. 

Consider a travelling wave in one dimension for a linear system. If the wave is moving, then the wave function $\Psi(x,t)$ describing the shape of the wave is a function of both $x$ and $t$. The instantaneous amplitude of the wave $\Psi(x,t)$ could correspond to the transverse displacement of a wave on a string, the longitudinal amplitude of a wave on a spring, the pressure of a longitudinal sound wave, the transverse electric or magnetic fields in an electromagnetic wave, a matter wave, etc. If the wave train maintains its shape as it moves, then one can describe the wave train by the function $f(\phi)$ where the coordinate $\phi$ is measured relative to the shape of the wave, that is, it could correspond to the phase of a crest of the wave. Consider that $f(\phi = 0)$ corresponds to a constant phase, e.g. the peak of the travelling pulse. Then, assuming that the wave travels at a phase velocity $v$ in the $x$ direction and the peak is at $x = 0$ for $t = 0$, it is at $x = vt$ at time $t$. That is, a point with phase $\phi$ fixed with respect to the waveform shape of the wave profile $f(\phi)$ moves in the $+x$ direction for $\phi = x - vt$ and in the $-x$ direction for $\phi = x + vt$.

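General wave motion can be described by solutions of a wave equation. The wave equation can be written in terms of the spatial and temporal derivatives of the wave function $\Psi(x,t)$. Consider the first partial derivatives of $\Psi(x,t) = f(x \mp vt) = f(\phi)$.

$$\frac{\partial\Psi}{\partial x} = \frac{d\Psi}{d\phi} \frac{\partial\phi}{\partial x} = \frac{d\Psi}{d\phi} \tag{3.90}$$

and

$$\frac{\partial\Psi}{\partial t} = \frac{d\Psi}{d\phi} \frac{\partial\phi}{\partial t} = \mp v \frac{d\Psi}{d\phi} \tag{3.91}$$

Factoring out $\frac{d\Psi}{d\phi}$ for the first derivatives gives

$$\frac{\partial\Psi}{\partial x} = \mp \frac{1}{v} \frac{\partial\Psi}{\partial t} \tag{3.92}$$

The sign in this equation depends on the sign of the wave velocity, making it not a generally useful formula.

Consider the second derivatives

$$\frac{\partial^2\Psi}{\partial x^2} = \frac{d^2\Psi}{d\phi^2} \left( \frac{\partial\phi}{\partial x} \right)^2 = \frac{d^2\Psi}{d\phi^2} \tag{3.93}$$

and

$$\frac{\partial^2\Psi}{\partial t^2} = \frac{d^2\Psi}{d\phi^2} \left( \frac{\partial\phi}{\partial t} \right)^2 = v^2 \frac{d^2\Psi}{d\phi^2} \tag{3.94}$$

Factoring out $\frac{d^2\Psi}{d\phi^2}$ gives

$$\frac{\partial^2\Psi}{\partial x^2} = \frac{1}{v^2} \frac{\partial^2\Psi}{\partial t^2} \tag{3.95}$$

This wave equation in one dimension for a linear system is independent of the sign of the velocity. There are an infinite number of possible shapes of waves, both travelling and standing, in one dimension; all of these must satisfy this one-dimensional wave equation. The converse is that any function that satisfies this one-dimensional wave equation must be a wave in this one dimension.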
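The claim that any profile of the form $f(x \mp vt)$ satisfies the one-dimensional wave equation can be tested with finite differences. The sketch below uses a Gaussian pulse as an arbitrary example profile; the pulse shape, function names, and step size `h` are assumptions made for illustration.

```python
import math

def pulse(phi):
    """Example wave profile f(phi); any smooth function would serve."""
    return math.exp(-phi**2)

def wave(x, t, v, direction=+1):
    """Psi(x, t) = f(x - direction*v*t); direction=+1 travels toward +x."""
    return pulse(x - direction * v * t)

def wave_equation_residual(x, t, v, direction=+1, h=1e-3):
    """d2Psi/dx2 - (1/v^2) d2Psi/dt2, estimated with central differences."""
    centre = wave(x, t, v, direction)
    d2x = (wave(x + h, t, v, direction) - 2.0 * centre + wave(x - h, t, v, direction)) / h**2
    d2t = (wave(x, t + h, v, direction) - 2.0 * centre + wave(x, t - h, v, direction)) / h**2
    return d2x - d2t / v**2
```

The residual is zero up to discretization error for both directions of travel, which reflects the fact that the second-order equation does not depend on the sign of the velocity.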

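The Wave Equation in three dimensions is

$$\nabla^2 \Psi \equiv \frac{\partial^2\Psi}{\partial x^2} + \frac{\partial^2\Psi}{\partial y^2} + \frac{\partial^2\Psi}{\partial z^2} = \frac{1}{v^2} \frac{\partial^2\Psi}{\partial t^2} \tag{3.96}$$

There are an infinite number of possible solutions $\Psi$ to this wave equation, any one of which corresponds to a wave motion with velocity $v$.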

The  Wave  Equation  is  applicable  to  all  manifestations  of  wave  motion,  both  transverse  and  longitudinal, 
for  linear  systems.  That  is,  it  applies  to  waves  on  a string,  water  waves,  seismic  waves,  sound  waves, 
electromagnetic  waves,  matter  waves,  etc.  If  it  can  be  shown  that  a wave  equation  can  be  derived  for  any 
system,  discrete  or  continuous,  then  this  is  equivalent  to  proving  the  existence  of  waves  of  any  waveform, 
frequency,  or  wavelength  travelling  with  the  phase  velocity  given  by  the  wave  equation.  [Cra65] 


3.8  Travelling  and  standing  wave  solutions  of  the  wave  equation 

The wave equation can have both travelling and standing-wave solutions. Consider a one-dimensional travelling wave with velocity $v$ having a specific wavenumber $k \equiv \frac{2\pi}{\lambda}$. Then the travelling wave is best written in terms of the phase of the wave as

$$\Psi(x,t) = A(k)\, e^{i\frac{2\pi}{\lambda}(x \mp vt)} = A(k)\, e^{i(kx \mp \omega t)} \tag{3.97}$$

where the wave number $k \equiv \frac{2\pi}{\lambda}$, with $\lambda$ being the wavelength, and the angular frequency $\omega \equiv kv$. This particular solution satisfies the wave equation and corresponds to a travelling wave with phase velocity $v = \frac{\omega}{k}$ in the positive or negative $x$ direction, depending on whether the sign is negative or positive. Assuming that the superposition principle applies, the superposition of these two particular solutions of the wave equation can be written as

$$\Psi(x,t) = A(k) \left( e^{i(kx - \omega t)} + e^{i(kx + \omega t)} \right) = A(k)\, e^{ikx} \left( e^{-i\omega t} + e^{i\omega t} \right) = 2A(k)\, e^{ikx} \cos\omega t \tag{3.98}$$

Thus the superposition of two identical single wavelength travelling waves propagating in opposite directions can correspond to a standing wave solution. Note that a standing wave is identical to a stationary normal mode of the system discussed in chapter 12. This transformation between standing and travelling waves can be reversed, that is, the superposition of two standing waves, i.e. normal modes, can lead to a travelling wave solution of the wave equation.
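The identity behind the standing wave, $e^{i(kx - \omega t)} + e^{i(kx + \omega t)} = 2e^{ikx}\cos\omega t$, can be confirmed with complex arithmetic. The function names and argument conventions below are illustrative assumptions.

```python
import cmath
import math

def travelling(x, t, k, omega, sign, amplitude=1.0):
    """A*exp(i(kx + sign*omega*t)); sign=-1 moves toward +x, sign=+1 toward -x."""
    return amplitude * cmath.exp(1j * (k * x + sign * omega * t))

def standing(x, t, k, omega, amplitude=1.0):
    """Standing wave 2*A*exp(ikx)*cos(omega*t) formed by the two travelling waves."""
    return 2.0 * amplitude * cmath.exp(1j * k * x) * math.cos(omega * t)
```

The real part of `standing` is $2A\cos kx \cos\omega t$, whose nodes at $kx = \pi/2, 3\pi/2, \ldots$ stay fixed in space for all times.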

Discussion of waveforms is simplified when using either of the following two limits.

1) The time dependence of the waveform at a given location $x = x_0$, which can be expressed using a Fourier decomposition, appendix I.2, of the time dependence as a function of angular frequency $\omega = n\omega_0$

$$\Psi(x_0, t) = \sum_{n=-\infty}^{\infty} A_n e^{in(k_0 x_0 - \omega_0 t)} = \sum_{n=-\infty}^{\infty} B_n(x_0)\, e^{-in\omega_0 t} \tag{3.99}$$

2) The spatial dependence of the waveform at a given instant $t = t_0$, which can be expressed using a Fourier decomposition of the spatial dependence as a function of wavenumber $k = nk_0$

$$\Psi(x, t_0) = \sum_{n=-\infty}^{\infty} A_n e^{in(k_0 x - \omega_0 t_0)} = \sum_{n=-\infty}^{\infty} C_n(t_0)\, e^{ink_0 x} \tag{3.100}$$

The above is applicable both to discrete and to continuous linear oscillator systems, e.g. waves on a string. In summary, stationary normal modes of a system are obtained by a superposition of travelling waves travelling in opposite directions, or equivalently, travelling waves can result from a superposition of stationary normal modes.

3.9 Waveform analysis

3.9.1  Harmonic  decomposition 

As described in appendix I, when superposition applies, a Fourier series decomposition of the form 3.101 can be made of any periodic function where

$$F(t) = \sum_{n=1}^{N} \alpha_n \cos(n\omega_0 t + \phi_n) \tag{3.101}$$

or the more general Fourier Transform can be made for an aperiodic function where

$$F(t) = \int \alpha(\omega) \cos(\omega t + \phi(\omega))\, d\omega \tag{3.102}$$

Any linear system that is subject to the forcing function $F(t)$ has an output that can be expressed as a linear superposition of the solutions of the individual harmonic components of the forcing function. Fourier analysis of periodic waveforms in terms of harmonic trigonometric functions plays a key role in describing oscillatory motion in classical mechanics and signal processing for linear systems. Fourier's theorem states that any arbitrary forcing function $F(t)$ can be decomposed into a sum of harmonic terms. As a consequence two equivalent representations can be used to describe signals and waves: the first is in the time domain, which describes the time dependence of the signal; the second is in the frequency domain, which describes the frequency decomposition of the signal. Fourier analysis relates these equivalent representations.

For example, the superposition of two equal intensity harmonic oscillators in the time domain is given by

$$y(t) = A\cos(\omega_1 t) + A\cos(\omega_2 t) = 2A \cos\left[ \left( \frac{\omega_1 + \omega_2}{2} \right) t \right] \cos\left[ \left( \frac{\omega_1 - \omega_2}{2} \right) t \right] \tag{3.103}$$

which leads to the phenomenon of beats as illustrated for both the time domain and frequency domain by figure 3.9.

Figure 3.9: The time and frequency representations of a system exhibiting beats.
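The beat identity can be checked numerically, together with the spacing of the envelope zeros. In the sketch below the function names are hypothetical, and `beat_period` is defined as the time between successive zeros of the slowly varying envelope $2A\cos[(\omega_1 - \omega_2)t/2]$, namely $2\pi/|\omega_1 - \omega_2|$.

```python
import math

def two_tone(t, omega_a, omega_b, amplitude=1.0):
    """Sum of two equal-amplitude harmonic oscillations."""
    return amplitude * (math.cos(omega_a * t) + math.cos(omega_b * t))

def beat_form(t, omega_a, omega_b, amplitude=1.0):
    """Product form: fast oscillation at the mean frequency times a slow envelope."""
    mean = 0.5 * (omega_a + omega_b)
    half_difference = 0.5 * (omega_a - omega_b)
    return 2.0 * amplitude * math.cos(mean * t) * math.cos(half_difference * t)

def beat_period(omega_a, omega_b):
    """Time between successive zeros of the envelope, 2*pi / |omega_a - omega_b|."""
    return 2.0 * math.pi / abs(omega_a - omega_b)
```

For `omega_a = 10` and `omega_b = 11` the envelope first vanishes at $t = \pi$ and then again every $2\pi$.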


3.9.2 The free linearly-damped linear oscillator

The response of the free, linearly-damped, linear oscillator is one of the most frequently encountered waveforms in science and thus it is useful to investigate the Fourier transform of this waveform. The damped waveform for the underdamped case, shown in figure 3.5, is given by equation (3.35), that is

$$f(t) = A e^{-\frac{\Gamma}{2} t} \cos(\omega_1 t - \delta) \qquad t > 0 \tag{3.104}$$

$$f(t) = 0 \qquad t < 0 \tag{3.105}$$

where $\omega_1^2 = \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2$ and where $\omega_0$ is the angular frequency of the undamped system. The Fourier transform is given by

$$G(\omega) = \frac{\omega_0}{(\omega^2 - \omega_1^2)^2 + (\Gamma\omega)^2} \left[ (\omega^2 - \omega_1^2) - i\Gamma\omega \right] \tag{3.106}$$

which is complex and has the famous Lorentz form.

Figure 3.10: The intensity $|f(t)|^2$ and Fourier transform $|G(\omega)|^2$ of the free linearly-underdamped harmonic oscillator with $\omega_0 = 10$ and damping $\Gamma = 1$.

The intensity of the wave gives

$$|f(t)|^2 = A^2 e^{-\Gamma t} \cos^2(\omega_1 t - \delta) \tag{3.107}$$

$$|G(\omega)|^2 = \frac{\omega_0^2}{(\omega^2 - \omega_1^2)^2 + (\Gamma\omega)^2} \tag{3.108}$$

Note that since the average over $2\pi$ of $\cos^2$ is $\frac{1}{2}$, the average over the $\cos^2(\omega_1 t - \delta)$ term gives the intensity $I(t) = \frac{A^2}{2} e^{-\Gamma t}$, which has a mean lifetime for the decay of $\tau = \frac{1}{\Gamma}$. The $|G(\omega)|^2$ distribution has the classic Lorentzian shape, shown in figure 3.12, which has a full width at half-maximum, FWHM, equal to $\Gamma$. Note that $G(\omega)$ is complex and thus one also can determine the phase shift $\delta$, which is given by the ratio of the imaginary to real parts of equation 3.106, i.e. $\tan\delta = \frac{\Gamma\omega}{\omega_1^2 - \omega^2}$.

The mean lifetime of the exponential decay of the intensity can be determined either by measuring $\tau$ from the time dependence, or by measuring the FWHM $\Gamma = \frac{1}{\tau}$ of the Fourier transform $|G(\omega)|^2$. In nuclear and atomic physics excited levels decay by photon emission with the wave form of the free linearly-damped, linear oscillator. Typically the mean lifetime $\tau$ can be measured when $\tau > 10^{-12}$ s, whereas for shorter lifetimes the radiation width $\Gamma$ becomes sufficiently large to be measured. Thus the two experimental approaches are complementary.
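The complementarity of the two measurements can be illustrated numerically. The sketch below assumes the Lorentz form $|G(\omega)|^2 = \omega_0^2/[(\omega^2 - \omega_1^2)^2 + (\Gamma\omega)^2]$ with $\omega_1^2 = \omega_0^2 - (\Gamma/2)^2$, and estimates its full width at half maximum from a uniform frequency grid; the function names, the grid range $0 < \omega \le 2\omega_0$, and the resolution are illustrative assumptions.

```python
import math

def lorentz_intensity(omega, omega0, gamma):
    """|G(omega)|^2 in the Lorentz form assumed above."""
    omega1_sq = omega0**2 - (gamma / 2.0) ** 2
    return omega0**2 / ((omega**2 - omega1_sq) ** 2 + (gamma * omega) ** 2)

def fwhm(omega0, gamma, steps=40000):
    """Grid estimate of the full width at half maximum; assumes a single narrow peak."""
    grid = [2.0 * omega0 * i / steps for i in range(1, steps + 1)]
    values = [lorentz_intensity(w, omega0, gamma) for w in grid]
    half = 0.5 * max(values)
    above = [w for w, value in zip(grid, values) if value >= half]
    return above[-1] - above[0]
```

For $\omega_0 = 10$ and $\Gamma = 1$, the values quoted for the figure, the estimated width is within about one percent of $\Gamma$, so the product of the width and the lifetime $\tau = 1/\Gamma$ is close to unity.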


3.9.3  Damped  linear  oscillator  subject  to  an  arbitrary  periodic  force 

Fourier's theorem states that any arbitrary forcing function $F(t)$ can be decomposed into a sum of harmonic terms. Consider the response of a damped linear oscillator to an arbitrary periodic force.

$$F(t) = \sum_{n=0}^{N} \alpha_n F_0(\omega_n) \cos(\omega_n t + \delta_n) \tag{3.109}$$

For each harmonic term $\omega_n$ the response of a linearly-damped linear oscillator to the forcing function $F(t) = F_0(\omega_n) \cos(\omega_n t)$ is given by equations (3.65) to (3.67) to be

$$x(t)_{Total} = x(t)_T + x(t)_S = \frac{F_0(\omega_n)}{m} \left[ e^{-\frac{\Gamma}{2} t} \cos(\omega_1 t - \beta_n) + \frac{1}{\sqrt{(\omega_0^2 - \omega_n^2)^2 + (\Gamma\omega_n)^2}} \cos(\omega_n t - \delta_n) \right] \tag{3.110}$$

The amplitude is obtained by substituting into (3.110) the derived values from the Fourier analysis.


3.2  Example:  Vibration  isolation 

Frequently it is desired to isolate instrumentation from the influence of horizontal and vertical external vibrations that exist in its environment. One arrangement to achieve this isolation is to mount a heavy base of mass $m$ on weak springs of spring constant $k$ plus weak damping. The response of this system is given by equation 3.109, which exhibits a resonance at the angular frequency $\omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2$ associated with each resonant frequency $\omega_0$ of the system. For each resonant frequency the system amplifies the vibrational amplitude for angular frequencies close to resonance, that is, below $\sqrt{2}\omega_0$, while it attenuates the vibration roughly by a factor of $\left(\frac{\omega_0}{\omega}\right)^2$ at higher frequencies. To avoid the amplification near the resonance it is necessary to make $\omega_0$ very much smaller than the frequency range of the vibrational spectrum and have a moderately high $Q$ value. This is achieved by using a very heavy base and a weak spring constant so that $\omega_0 = \sqrt{k/m}$ is very small. A typical table may have its resonance frequency at 0.5 Hz, which is well below typical perturbing vibrational frequencies, and thus the table attenuates the vibration by 99% at 5 Hz, since $(0.5/5)^2 = 0.01$, with even more attenuation for higher-frequency perturbations. This principle is used extensively in the design of vibration-isolation tables for optics or microbalance equipment.
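The quoted attenuation can be reproduced with a short calculation. This is a minimal sketch that uses the steady-state amplitude of the driven, linearly-damped oscillator as a proxy for the table response; the damping value (chosen so that $Q = \omega_0/\Gamma = 10$) and the function names are assumptions made for illustration.

```python
import numpy as np

def steady_state_amplitude(omega, omega0, gamma, f0_over_m=1.0):
    """Steady-state amplitude of a linearly-damped oscillator driven at omega."""
    return f0_over_m / np.sqrt((omega0**2 - omega**2) ** 2 + (gamma * omega) ** 2)

def relative_response(omega, omega0, gamma):
    """Amplitude at omega divided by the static (omega -> 0) amplitude."""
    return (steady_state_amplitude(omega, omega0, gamma)
            / steady_state_amplitude(0.0, omega0, gamma))

f0 = 0.5                  # table resonance in Hz (value quoted in the text)
omega0 = 2 * np.pi * f0
gamma = omega0 / 10.0     # assumed damping, i.e. Q = omega0 / gamma = 10

for f in (0.5, 5.0, 50.0):
    r = relative_response(2 * np.pi * f, omega0, gamma)
    print(f"{f:5.1f} Hz: relative amplitude = {r:.4f}")
```

At resonance the response is amplified by roughly $Q$, while a decade above resonance it has fallen to about one percent of the static response, consistent with the $(\omega_0/\omega)^2$ estimate.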


[Figure: a rigid tabletop supported on soft spring isolators with dampers.]

Seismic isolation of an optical bench.
3.10  Signal  processing

It has been shown that the response of the linearly-damped linear oscillator, subject to any arbitrary periodic force, can be calculated using a frequency decomposition (Fourier analysis) of the force, appendix I. The response can equally well be calculated using a time-ordered discrete-time sampling of the pulse shape; that is, the Green's function approach, appendix I. The linearly-damped, linear oscillator is the simplest example of a linear system that exhibits both resonance and frequency-dependent response. Typical physical linear systems exhibit far more complicated response functions with multiple resonances and corresponding frequency response. For example, an automobile suspension system involves four wheels and associated springs plus dampers, allowing the car to rock sideways, or forward and backward, in addition to the up-down motion, when subject to the forces produced by a rough road. Similarly a suspension bridge or aircraft wing can twist as well as bend due to air turbulence, or a building can undergo complicated oscillations due to seismic waves. An acoustic system exhibits similar complexity. Signal analysis and signal processing are of pivotal importance in elucidating the response of complicated linear systems to complicated periodic forcing functions. They are used extensively in engineering, acoustics, and science.

The response of a low-pass filter, such as an R-C circuit or a coaxial cable, to an input square wave, shown in figure 3.13, provides a simple example of the relative advantages of using the complementary Fourier analysis in the frequency domain, or the Green's discrete-function analysis in the time domain. The upper curves show a repetitive square-wave input signal in the time domain and its Fourier transform in the frequency domain. The middle curves show the time dependence of the response of the low-pass filter to an impulse, $I(t)$, and its Fourier transform $H(\omega)$. The output of the low-pass filter can be calculated by folding (convolving) the input square wave and the impulse time dependence in the time domain, as shown on the left, or by multiplying their Fourier transforms, as shown on the right. Working in the frequency domain, the response of linear mechanical systems, such as an automobile suspension or a musical instrument, as well as linear electronic signal processing systems such as amplifiers, loudspeakers and microphones, can be treated as black boxes having a certain transfer function $H(\omega,\phi)$ describing the gain and phase shift versus frequency. That is, the output wave frequency decomposition is

$$G(\omega)_{output} = H(\omega,\phi)\cdot G(\omega)_{input} \qquad (3.111)$$

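Working in the time domain, the low-pass system has an impulse response $I(t) = e^{-t/\tau}$, which is the Fourier transform of the transfer function $H(\omega,\phi)$. In the time domain

$$y(t)_{output} = \int_{-\infty}^{\infty} x(\tau')\cdot I(t-\tau')\,d\tau' \qquad (3.112)$$

where $x(t)$ is the input signal and $\tau'$ is the integration variable.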

This is shown schematically in figure 3.13. The Fourier transformation connects the three quantities in the time domain with the corresponding three in the frequency domain. For example, the impulse response of the low-pass filter has a fall time of $\tau$, which is related by a Fourier transform to the width of the transfer function. Thus the time and frequency domain approaches are closely related and give the same result for the output signal of the low-pass filter for the applied square-wave input signal. The result is that the higher-frequency components are attenuated, leading to slow rise and fall times in the time domain.
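The equivalence of the two approaches can be illustrated with a small numerical experiment. The sketch below is written under assumed values for the sample spacing, the square-wave period and the filter fall time; it folds a square wave with an exponential impulse response in the time domain, and separately multiplies the two discrete Fourier transforms and transforms back.

```python
import numpy as np

dt = 1e-3                    # sample spacing in seconds (assumed)
t = np.arange(0.0, 2.0, dt)
tau = 0.05                   # filter fall time in seconds (assumed)

# Input: repetitive square wave of period 0.5 s; impulse response: exp(-t/tau).
x = (np.mod(t, 0.5) < 0.25).astype(float)
h = np.exp(-t / tau) / tau   # normalised so that the low-frequency gain is about 1

# Time domain: discrete version of the folding (convolution) integral.
y_time = np.convolve(x, h)[: t.size] * dt

# Frequency domain: multiply the transforms, then transform back.
n = 2 * t.size               # zero padding avoids wrap-around of the periodic FFT
y_freq = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(h, n), n)[: t.size] * dt

print("max difference:", np.max(np.abs(y_time - y_freq)))
```

The two outputs agree to rounding error, and the output edges rise and fall on the time scale $\tau$ because the high-frequency components of the square wave are attenuated.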

Analog signal processing and Fourier analysis were the primary tools to analyze and process all forms of periodic motion during the 20th century. For example, musical instruments, mechanical systems, and electronic circuits all employed resonant systems to enhance the desired frequencies and suppress the undesirable frequencies, and the signals were observed using analog oscilloscopes. The remarkable development of computing has enabled use of digital signal processing, leading to a revolution in signal processing that has had a profound impact on both science and engineering. For example, the digital oscilloscope, which can sample at frequencies above $10^9$ Hz, has replaced the analog oscilloscope because it allows sophisticated analysis of each individual signal that was not possible using analog signal processing. As another example, the analog approach in nuclear physics involved tiny analog electric signals, produced by many individual radiation detectors, that were transmitted hundreds of meters via carefully shielded and expensive coaxial cables to the data room, where the signals were amplified and processed using analog filters to maximize the signal-to-noise ratio in order to separate the signal from the background noise. Stray electromagnetic radiation picked up via the cables significantly degraded the signals. The performance and limitations of the analog electronics severely restricted the pulse processing capabilities. Digital signal processing has rapidly replaced analog signal processing. Analog-to-digital circuits are built directly into the electronics for each individual detector so that only digital information needs to be transmitted from each detector to the analysis computers. Computer processing provides unlimited and flexible processing capabilities for the digital signals, greatly enhancing the response and sensitivity of our detector systems. Common examples of digital signal processing are digital CD and DVD disks.

[Figure: panels in two columns, Time domain and Frequency domain.]

Figure 3.11: Response of an RC electrical circuit to an input square wave. The upper row shows the time and the exponential-form frequency representations of the square-wave input signal. The middle row gives the impulse response, and corresponding transfer function, for the RC circuit. The bottom row shows the corresponding output properties in both the time and frequency domains.

3.11  Wave  propagation 

Wave motion typically involves a packet of waves encompassing a finite number of wave cycles. Information in a wave can only be transmitted by starting, stopping, or modulating the amplitude of a wave train, which is equivalent to forming a wave packet. For example, a musician will play a note for a finite time, and this wave train propagates out as a wave packet of finite length. You have no information as to the frequency and amplitude of the sound prior to the wave packet reaching you, or after the wave packet has passed you. The velocity of the wavelets contained within the wave packet is called the phase velocity. For a dispersive system the phase velocity of the wavelets contained within the wave packet is frequency dependent and the shape of the wave packet travels at the group velocity, which usually differs from the phase velocity. If the shape of the wave packet is time dependent, then neither the phase velocity, which is the velocity of the wavelets, nor the group velocity, which is the velocity of an instantaneous point fixed to the shape of the wave packet envelope, represents the actual velocity of the overall wavepacket.

A third wavepacket velocity, the signal velocity, is defined to be the velocity of the leading edge of the energy distribution, and corresponding information content, of the wave packet. For most linear systems the shape of the wave packet is not time dependent and then the group and signal velocities are identical. However, the group and signal velocities can be very different for non-linear systems, as discussed in chapter 4.7. Note that even when the phase velocity of the waves within the wave packet exceeds the group velocity of the shape, or the signal velocity of the energy content of the envelope of the wave packet, the information contained in a wave packet is only manifest when the wave packet envelope reaches the detector, and this energy and information travel at the signal velocity.

The modern ideas of wave propagation, including Hamilton's concept of group velocity, were developed by Lord Rayleigh when applied to the theory of sound [Ray1887]. The concept of phase, group, and signal velocities played a major role in discussion of electromagnetic waves as well as de Broglie's development of the concept of wave-particle duality and the development of wave mechanics by Schrödinger.

3.11.1  Phase,  group,  and  signal  velocities  of  wave  packets 

The concepts of wave packets, as well as their phase, group, and signal velocities, are of considerable importance for propagation of information and other manifestations of wave motion in science and engineering, which warrants further discussion at this juncture.

Consider a particular $k, \omega$ component of a one-dimensional wave,

$$q(x,t) = A e^{i(kx \pm \omega t)} \qquad (3.113)$$

The argument of the exponential is called the phase $\phi$ of the wave where

$$\phi = kx - \omega t \qquad (3.114)$$

If we move along the $x$ axis at a velocity such that the phase is constant then we perceive a stationary wave. The velocity of this wave is called the phase velocity. To ensure constant phase we require that $\phi$ is constant or, assuming real $k$ and $\omega$,

$$\omega\,dt = k\,dx \qquad (3.115)$$

Therefore the phase velocity is defined to be

$$v_{phase} = \frac{\omega}{k} \qquad (3.116)$$

The velocity we have used so far is just the phase velocity of the individual wavelets at the carrier frequency. If $k$ or $\omega$ are complex then one must take the real parts to ensure that the velocity is real.

If the phase velocity of a wave is dependent on the wavelength, that is, $v_{phase}(\lambda)$, then the system is said to be dispersive in that the wave is dispersed according to the wavelength. The simplest illustration of dispersion is the refraction of light in a glass prism, which leads to dispersion of the light into the spectrum of wavelengths. Dispersion leads to development of wave packets that travel at group and signal velocities that usually differ from the phase velocity. To illustrate this consider two equal-amplitude travelling waves having slightly different wave number $k$ and angular frequency $\omega$. Superposition of these waves gives


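$$q(x,t) = A\left(e^{i[kx-\omega t]} + e^{i[(k+\Delta k)x-(\omega+\Delta\omega)t]}\right)$$
$$= A e^{i[(k+\frac{\Delta k}{2})x-(\omega+\frac{\Delta\omega}{2})t]}\cdot\left\{e^{-i[\frac{\Delta k}{2}x-\frac{\Delta\omega}{2}t]} + e^{i[\frac{\Delta k}{2}x-\frac{\Delta\omega}{2}t]}\right\}$$
$$= 2A e^{i[(k+\frac{\Delta k}{2})x-(\omega+\frac{\Delta\omega}{2})t]}\cos\left[\frac{\Delta k}{2}x-\frac{\Delta\omega}{2}t\right] \qquad (3.117)$$

This corresponds to a wave with the average carrier frequency modulated by the cosine term, which has a wavenumber of $\frac{\Delta k}{2}$ and angular frequency $\frac{\Delta\omega}{2}$; that is, this is the usual example of beats. The cosine term modulates the average wave, producing wave packets as shown in figure 3.11. The velocity of these wave packets is called the group velocity, given by requiring that the phase of the modulating term is constant, that is

$$\frac{\Delta k}{2}dx = \frac{\Delta\omega}{2}dt \qquad (3.118)$$

Thus the group velocity is given by

$$v_{group} = \frac{dx}{dt} = \frac{\Delta\omega}{\Delta k} \qquad (3.119)$$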
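The factorisation of the two-wave superposition into a carrier and a cosine envelope can be verified numerically. In this minimal sketch the amplitude, wavenumber, frequency and their offsets are arbitrary assumed values; only the algebraic identity and the definitions of the two velocities come from the text.

```python
import numpy as np

A, k, omega = 1.0, 5.0, 3.0    # amplitude, wavenumber, angular frequency (assumed)
dk, domega = 0.2, 0.1          # small offsets of the second wave (assumed)

x = np.linspace(0.0, 100.0, 2001)
t = 7.0                        # an arbitrary instant

# Direct superposition of the two travelling waves.
two_waves = A * (np.exp(1j * (k * x - omega * t))
                 + np.exp(1j * ((k + dk) * x - (omega + domega) * t)))

# Carrier at the average (k, omega) multiplied by the beat envelope.
carrier = np.exp(1j * ((k + dk / 2) * x - (omega + domega / 2) * t))
envelope = 2 * A * np.cos(dk * x / 2 - domega * t / 2)

v_phase = omega / k            # velocity of the wavelets
v_group = domega / dk          # velocity of the envelope
print(np.allclose(two_waves, carrier * envelope), v_phase, v_group)
```

With these assumed numbers the envelope moves more slowly than the wavelets inside it, which is the situation described for a dispersive system.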


If dispersion is present then the group velocity $v_{group} = \frac{\Delta\omega}{\Delta k}$ does not equal the phase velocity $v_{phase} = \frac{\omega}{k}$. Expanding the above example to superposition of $n$ waves gives

$$q(x,t) = \sum_{r=1}^{n} A_r e^{i(k_r x \pm \omega_r t)} \qquad (3.120)$$

In the event that $n \rightarrow \infty$ and the frequencies are continuously distributed, then the summation is replaced by an integral

$$q(x,t) = \int_{-\infty}^{\infty} A(k)e^{i(kx \pm \omega t)}dk \qquad (3.121)$$

where the factor $A(k)$ represents the distribution of amplitudes of the component waves, that is, the spectral decomposition of the wave. This is the usual Fourier decomposition of the spatial distribution of the wave.

Consider an extension of the linear superposition of two waves to a well defined wave packet where the amplitude is nonzero only for a small range of wavenumbers $k_0 \pm \Delta k$.

$$q(x,t) = \int_{k_0-\Delta k}^{k_0+\Delta k} A(k)e^{i(kx-\omega t)}dk \qquad (3.122)$$

This functional shape is called a wave packet, which only has meaning if $\Delta k \ll k_0$. The angular frequency can be expressed by making a Taylor expansion around $k_0$

$$\omega(k) = \omega(k_0) + \left(\frac{d\omega}{dk}\right)_{k_0}(k-k_0) + \ldots \qquad (3.123)$$

For a linear system the phase then reduces to

$$kx - \omega t = (k_0 x - \omega_0 t) + (k-k_0)x - \left(\frac{d\omega}{dk}\right)_{k_0}(k-k_0)t \qquad (3.124)$$

The summation of terms in the exponent given by 3.124 leads to the amplitude 3.122 having the form of a product where the integral becomes

$$q(x,t) = e^{i(k_0 x-\omega_0 t)}\int_{k_0-\Delta k}^{k_0+\Delta k} A(k)e^{i(k-k_0)\left[x-\left(\frac{d\omega}{dk}\right)_{k_0}t\right]}dk \qquad (3.125)$$

The integral term modulates the $e^{i(k_0 x-\omega_0 t)}$ first term.

The group velocity is defined to be that for which the phase of the exponential term in the integral is constant. Thus

$$v_{group} = \left(\frac{d\omega}{dk}\right)_{k_0} \qquad (3.126)$$

Since $\omega = k v_{phase}$ then

$$v_{group} = v_{phase} + k\frac{\partial v_{phase}}{\partial k} \qquad (3.127)$$

For non-dispersive systems the phase velocity is independent of the wave number $k$ or angular frequency $\omega$ and thus $v_{group} = v_{phase}$. The case discussed earlier, equation (3.119), for beating of two waves gives the same relation in the limit that $\Delta\omega$ and $\Delta k$ are infinitesimal.

The group velocity of a wave packet is of physical significance for dispersive media where $v_{group} = \left(\frac{d\omega}{dk}\right)_{k_0} \neq \frac{\omega}{k} = v_{phase}$. Every wave train has a finite extent and thus we usually observe the motion of a group of waves rather than the wavelets moving within the wave packet. In general, for non-linear dispersive systems the derivative $\frac{\partial v_{phase}}{\partial k}$ can be either positive or negative and thus in principle the group velocity can either be greater than, or less than, the phase velocity. Moreover, if the group velocity is frequency dependent, that is, when group velocity dispersion occurs, then the overall shape of the wave packet is time dependent and thus the speed of a specific relative location defined by the shape of the envelope of the wave packet does not represent the signal velocity of the wave packet. Brillouin showed that the distribution of the energy, and corresponding information content, in any wave packet travels at the signal velocity, which can be different from the group velocity if the shape of the envelope of the wave packet is time dependent. For electromagnetic waves one has the possibility that the group velocity $v_{group} > v_{phase} = c$. In 1914 Brillouin [Bri14] [Bri60] showed that the signal velocity of electromagnetic waves, defined by the leading edge of the time-dependent envelope of the wave packet, never exceeds $c$ even though the group velocity corresponding to the velocity of the instantaneous shape of the wave packet may exceed $c$. Thus, there is no violation of Einstein's fundamental principle of relativity that the velocity of an electromagnetic wave cannot exceed $c$.
3.3  Example:  Water  waves  breaking  on  a beach


The concepts of phase and group velocity are illustrated by the example of water waves moving at velocity $v$ incident upon a straight beach at an angle $\alpha$ to the shoreline. Consider that the wavepacket comprises many wavelengths of wavelength $\lambda$. During the time it takes the wave to travel a distance $\lambda$, the point where the crest of one wave breaks on the beach travels a distance $\frac{\lambda}{\cos\alpha}$ along the beach. Thus the phase velocity of the crest of the one wavelet in the wave packet is

$$v_{phase} = \frac{v}{\cos\alpha}$$

The velocity of the wave packet along the beach equals

$$v_{group} = v\cos\alpha$$

Note that for the wave moving parallel to the beach $\alpha = 0$ and $v_{phase} = v_{group} = v$. However, for $\alpha = \frac{\pi}{2}$, $v_{phase} \rightarrow \infty$ and $v_{group} \rightarrow 0$. In general for waves breaking on the beach

$$v_{phase}v_{group} = v^2$$

The same behavior is exhibited by surface waves bouncing off the sides of the Erie canal, sound waves in a trombone, and electromagnetic waves transmitted down a rectangular wave guide. In the latter case the phase velocity exceeds the velocity of light $c$ in apparent violation of Einstein's theory of relativity. However, the information travels at the signal velocity, which is less than $c$.


3.4  Example:  Surface  waves  for  deep  water 


In the "Theory of Sound" Rayleigh discusses the example of surface waves for water where he derives a dispersion relation for the phase velocity $v_{phase}$ and wavenumber $k$ which are related to the density $\rho$, depth $l$, gravity $g$, and surface tension $T$, by

$$\omega^2 = \left(gk + \frac{Tk^3}{\rho}\right)\tanh(kl)$$

For deep water where the wavelength is short compared with the depth, that is, $kl \gg 1$, then $\tanh(kl) \rightarrow 1$ and the dispersion relation is given approximately by

$$\omega^2 = gk + \frac{Tk^3}{\rho}$$

For long surface waves for deep water, that is, small $k$, the gravitational first term in the dispersion relation dominates and the group velocity is given by

$$v_{group} = \frac{d\omega}{dk} = \frac{1}{2}\sqrt{\frac{g}{k}} = \frac{1}{2}\frac{\omega}{k} = \frac{v_{phase}}{2}$$

That is, the group velocity is half of the phase velocity. Here the wavelets build at the back of the wave packet, progress through the wave packet and dissipate at the front. This can be demonstrated by dropping a pebble into a calm lake. It will be seen that the surface disturbance comprises a wave packet moving outwards at the group velocity with the individual waves within the wave packet expanding at twice the group velocity of the wavepacket, that is, they appear at the inner radius of the wave packet and disappear at the outer radius of the wave packet.

For small wavelength ripples, where $k$ is large, the surface tension term dominates and the dispersion relation is approximately given by

$$\omega^2 \simeq \frac{Tk^3}{\rho}$$

leading to a group velocity of

$$v_{group} = \frac{d\omega}{dk} = \frac{3}{2}\frac{\omega}{k} = \frac{3}{2}v_{phase}$$

Here the group velocity exceeds the phase velocity and wavelets build at the front of the wave packet and dissipate at the back. Note that for this linear system the Brillouin signal velocity equals the group velocity for both gravity and surface tension waves for deep water.
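The two limiting ratios can be checked directly from the deep-water dispersion relation. The following sketch differentiates $\omega(k)$ numerically; the values used for the surface tension and density of water are approximate room-temperature figures and, like the function names and the chosen wavelengths, are assumptions made for illustration.

```python
import numpy as np

g = 9.81        # gravitational acceleration, m/s^2
T = 0.072       # surface tension of water, N/m (approximate, assumed)
rho = 1000.0    # density of water, kg/m^3 (approximate, assumed)

def omega(k):
    """Deep-water dispersion relation: omega^2 = g k + T k^3 / rho."""
    return np.sqrt(g * k + T * k**3 / rho)

def velocity_ratio(k, rel_step=1e-6):
    """v_group / v_phase, with d(omega)/dk from a central finite difference."""
    dk = k * rel_step
    v_group = (omega(k + dk) - omega(k - dk)) / (2 * dk)
    return v_group / (omega(k) / k)

for wavelength in (100.0, 1.0, 0.017, 1e-4):    # metres
    k = 2 * np.pi / wavelength
    print(f"lambda = {wavelength:g} m: v_group / v_phase = {velocity_ratio(k):.3f}")
```

Long gravity waves give a ratio close to 1/2 and short capillary ripples give a ratio close to 3/2, with a crossover at centimetre wavelengths where the two terms in the dispersion relation are comparable.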
3.5  Example:  Electromagnetic  waves  in  ionosphere

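The response to radio waves of the free electron plasma in the ionosphere provides an excellent example that involves cut-off frequency, complex wavenumber $k$, as well as the phase, group, and signal velocities. Maxwell's equations give the most general wave equation for electromagnetic waves to be

$$\nabla^2\mathbf{E} - \varepsilon\mu\frac{\partial^2\mathbf{E}}{\partial t^2} = \nabla\left(\frac{\rho_{free}}{\varepsilon}\right) + \mu\frac{\partial\mathbf{j}_{free}}{\partial t}$$
$$\nabla^2\mathbf{H} - \varepsilon\mu\frac{\partial^2\mathbf{H}}{\partial t^2} = -\nabla\times\mathbf{j}_{free}$$

where $\rho_{free}$ and $\mathbf{j}_{free}$ are the unbound charge and current densities. The effects of the bound charges and currents are absorbed into $\varepsilon$ and $\mu$. Ohm's Law can be written in terms of the electrical conductivity $\sigma$, which is a constant

$$\mathbf{j} = \sigma\mathbf{E}$$

Assuming Ohm's Law, plus assuming $\rho_{free} = 0$ in the plasma, gives the relations

$$\nabla^2\mathbf{E} - \varepsilon\mu\frac{\partial^2\mathbf{E}}{\partial t^2} - \sigma\mu\frac{\partial\mathbf{E}}{\partial t} = 0$$
$$\nabla^2\mathbf{H} - \varepsilon\mu\frac{\partial^2\mathbf{H}}{\partial t^2} - \sigma\mu\frac{\partial\mathbf{H}}{\partial t} = 0$$

The third term in both of these wave equations is a damping term that leads to a damped solution of an electromagnetic wave in a good conductor.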

These damped wave equations can be solved by considering an incident wave

$$\mathbf{E} = E_0\hat{\mathbf{x}}e^{i(\omega t-kz)}$$

Substituting for $\mathbf{E}$ in the first damped wave equation gives

$$-k^2 + \omega^2\varepsilon\mu - i\omega\sigma\mu = 0$$

That is

$$k^2 = \omega^2\varepsilon\mu\left[1 - \frac{i\sigma}{\omega\varepsilon}\right]$$

In general $k$ is complex, that is, it has real $k_R$ and imaginary $k_I$ parts that lead to a solution of the form

$$\mathbf{E} = \mathbf{E}_0 e^{-k_I z}e^{i(\omega t-k_R z)}$$

The first exponential term is an exponential damping term while the second exponential term is the oscillating term.

Consider that the plasma involves the motion of a bound damped electron, of charge $q$ and mass $m$, bound in a one-dimensional atom or lattice subject to an oscillatory electric field of frequency $\omega$. Assume that the electromagnetic wave is travelling in the $z$ direction with the transverse electric field in the $x$ direction. The equation of motion of an electron can be written as

$$\ddot{\mathbf{x}} + \Gamma\dot{\mathbf{x}} + \omega_0^2\mathbf{x} = \hat{\mathbf{x}}\frac{q}{m}E_0e^{i(\omega t-kz)}$$

where $\Gamma$ is the damping factor. The instantaneous displacement of the oscillating charge equals

$$\mathbf{x} = \frac{q}{m}\frac{1}{(\omega_0^2-\omega^2)+i\Gamma\omega}\hat{\mathbf{x}}E_0e^{i(\omega t-kz)}$$

and the velocity is

$$\dot{\mathbf{x}} = \frac{q}{m}\frac{i\omega}{(\omega_0^2-\omega^2)+i\Gamma\omega}\hat{\mathbf{x}}E_0e^{i(\omega t-kz)}$$

Thus the instantaneous current density is given by

$$\mathbf{j} = Nq\dot{\mathbf{x}} = \frac{Nq^2}{m}\frac{i\omega}{(\omega_0^2-\omega^2)+i\Gamma\omega}\hat{\mathbf{x}}E_0e^{i(\omega t-kz)}$$

Therefore the electrical conductivity is given by

$$\sigma = \frac{Nq^2}{m}\frac{i\omega}{(\omega_0^2-\omega^2)+i\Gamma\omega}$$

Let us consider only unbound charges in the plasma, that is, let $\omega_0 = 0$. Then the conductivity is given by

$$\sigma = \frac{Nq^2}{m}\frac{i\omega}{i\Gamma\omega-\omega^2}$$

For a low density ionized plasma $\omega \gg \Gamma$, thus the conductivity is given approximately by

$$\sigma \approx -i\frac{Nq^2}{m\omega}$$

Since $\sigma$ is pure imaginary, $\mathbf{j}$ and $\mathbf{E}$ have a phase difference of $\frac{\pi}{2}$, which implies that the average of the Joule heating over a complete period is $\langle\mathbf{j}\cdot\mathbf{E}\rangle = 0$. Thus there is no energy loss due to Joule heating, implying that the electromagnetic energy is conserved.

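Substitution of $\sigma$ into the relation for $k^2$ gives

$$k^2 = \omega^2\varepsilon\mu\left[1 - \frac{i\sigma}{\omega\varepsilon}\right] = \omega^2\varepsilon\mu\left[1 - \frac{Nq^2}{\varepsilon m\omega^2}\right]$$

Define the plasma oscillation frequency $\omega_P$ to be

$$\omega_P = \sqrt{\frac{Nq^2}{\varepsilon m}}$$

then $k^2$ can be written as

$$k^2 = \omega^2\varepsilon\mu\left[1 - \left(\frac{\omega_P}{\omega}\right)^2\right] \qquad (\alpha)$$

For a low density plasma the dielectric constant $\kappa_E \simeq 1$ and the relative permeability $\kappa_B \simeq 1$ and thus $\varepsilon = \kappa_E\varepsilon_0 \simeq \varepsilon_0$ and $\mu = \kappa_B\mu_0 \simeq \mu_0$. The velocity of light in vacuum is $c = \frac{1}{\sqrt{\varepsilon_0\mu_0}}$. Thus for low density equation $\alpha$ can be written as

$$\omega^2 = \omega_P^2 + c^2k^2 \qquad (\beta)$$

Differentiation of equation $\beta$ with respect to $k$ gives $2\omega\frac{d\omega}{dk} = 2c^2k$. That is, $v_{phase}v_{group} = c^2$ and the phase velocity is

$$v_{phase} = \frac{c}{\sqrt{1-\left(\frac{\omega_P}{\omega}\right)^2}}$$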

There are two cases to consider.

1) $\omega > \omega_P$: For this case $1 - \left(\frac{\omega_P}{\omega}\right)^2 > 0$ and thus $k$ is a pure real number. Therefore the electromagnetic wave is transmitted with a phase velocity that exceeds $c$ while the group velocity is less than $c$.

2) $\omega < \omega_P$: For this case $1 - \left(\frac{\omega_P}{\omega}\right)^2 < 0$ and thus $k$ is a pure imaginary number. Therefore the electromagnetic wave is not transmitted and in the ionosphere it is attenuated rapidly as $e^{-k_I z}$, where $k_I = \frac{1}{c}\sqrt{\omega_P^2-\omega^2}$ follows from equation $\beta$. However, since there are no Joule heating losses, the electromagnetic wave must be completely reflected. Thus the plasma oscillation frequency serves as a cut-off frequency. For this example the signal and group velocities are identical.

For the ionosphere $N = 10^{11}$ electrons/m$^3$, which corresponds to a plasma oscillation frequency of $\nu = \omega_P/2\pi = 3$ MHz. Thus electromagnetic waves in the AM waveband ($< 1.6$ MHz) are totally reflected by the ionosphere and bounce repeatedly around the Earth, whereas for VHF frequencies above 3 MHz the waves are transmitted and refracted, passing through the atmosphere. Thus light is transmitted by the ionosphere. By contrast, for a good conductor like silver, the plasma oscillation frequency is around $10^{16}$ Hz, which is in the far ultraviolet part of the spectrum. Thus, all lower frequencies, such as light, are totally reflected by such a good conductor, whereas X-rays have frequencies above the plasma oscillation frequency and are transmitted.
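The quoted cut-off frequency follows from the definition of $\omega_P$ with $\varepsilon \simeq \varepsilon_0$. The sketch below evaluates it for the electron density given in the text and then evaluates the phase and group velocities above cut-off from $\omega^2 = \omega_P^2 + c^2k^2$. The physical constants are standard SI values; the 100 MHz test frequency and the function names are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12    # vacuum permittivity, F/m
Q_E = 1.602176634e-19      # magnitude of the electron charge, C
M_E = 9.1093837015e-31     # electron mass, kg
C = 2.99792458e8           # speed of light in vacuum, m/s

def plasma_frequency_hz(n_electrons):
    """nu_P = omega_P / (2 pi), with omega_P = sqrt(N q^2 / (eps0 m))."""
    return math.sqrt(n_electrons * Q_E**2 / (EPS0 * M_E)) / (2 * math.pi)

def velocities(nu, nu_p):
    """Phase and group velocity for a transmitted wave (requires nu > nu_p)."""
    factor = math.sqrt(1.0 - (nu_p / nu) ** 2)
    return C / factor, C * factor

nu_p = plasma_frequency_hz(1e11)               # density quoted in the text
v_phase, v_group = velocities(100e6, nu_p)     # a 100 MHz signal (assumed example)
print(f"nu_P = {nu_p / 1e6:.2f} MHz, "
      f"v_phase/c = {v_phase / C:.6f}, v_group/c = {v_group / C:.6f}")
```

The computed cut-off is a little below 3 MHz, consistent with the rounded value in the text, and the product of the two velocities equals $c^2$ with the phase velocity above $c$ and the group velocity below it.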
3.11.2  Fourier  transform  of  wave  packets


The relation between the time distribution and the corresponding frequency distribution, or equivalently, the spatial distribution and the corresponding wave-number distribution, is of considerable importance in discussion of wave packets and signal processing. It directly relates to the uncertainty principle that is a characteristic of all forms of wave motion. The relation between the time and corresponding frequency distribution is given via the Fourier transform discussed in appendix I. The following are two examples of the Fourier transforms of typical but rather different wavepacket shapes that are encountered frequently in science and engineering.

3.6  Example:  Fourier  transform  of  a Gaussian  wave  packet

Assume that the amplitude of the wave is a Gaussian wave packet, shown in the adjacent figure, where

$$G(\omega) = c\,e^{-\frac{(\omega-\omega_0)^2}{2\sigma_\omega^2}}$$

This leads to the Fourier transform

$$f(t) = c\sqrt{2\pi}\,\sigma_\omega\,e^{-\frac{\sigma_\omega^2t^2}{2}}\cos(\omega_0 t)$$

[Figure: Fourier transform of a Gaussian frequency distribution.]

Note that the wavepacket has a standard deviation for the amplitude of the wavepacket of $\sigma_t = \frac{1}{\sigma_\omega}$, that is, $\sigma_t\cdot\sigma_\omega = 1$. The Gaussian wavepacket results in the minimum product of the standard deviations of the frequency and time representations for a wavepacket. This has profound importance for all wave phenomena, and especially for quantum mechanics. Because matter exhibits wave-like behavior, the above property of wave packets leads to Heisenberg's Uncertainty Principle. For signal processing, it shows that if you truncate a wavepacket you will broaden the frequency distribution.
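The unit product of the two widths can be confirmed by transforming a Gaussian spectrum numerically. In this sketch the carrier is set to $\omega_0 = 0$ so that only the envelope is examined, the transform is taken as $f(t) = \int G(\omega)e^{i\omega t}d\omega$ evaluated by direct summation, and the spectral width, grids and helper name are assumed values chosen for illustration.

```python
import numpy as np

def std_dev(axis, weights):
    """Standard deviation of `axis` weighted by a non-negative distribution."""
    w = weights / weights.sum()
    mean = np.sum(axis * w)
    return np.sqrt(np.sum((axis - mean) ** 2 * w))

sigma_omega = 2.0                          # assumed spectral width, rad/s
omega = np.linspace(-16.0, 16.0, 1601)     # spectrum centred on omega0 = 0
t = np.linspace(-4.0, 4.0, 801)

G = np.exp(-omega**2 / (2 * sigma_omega**2))

# Envelope f(t) = integral of G(omega) exp(i omega t) d omega, summed on the grid.
d_omega = omega[1] - omega[0]
f = (G[None, :] * np.exp(1j * omega[None, :] * t[:, None])).sum(axis=1) * d_omega

sigma_t = std_dev(t, np.abs(f))            # width of the amplitude envelope
product = sigma_t * std_dev(omega, G)
print(f"sigma_t * sigma_omega = {product:.4f}")
```

The product of the amplitude widths comes out at unity to within the discretisation error, and the peak of the envelope matches the prefactor $c\sqrt{2\pi}\sigma_\omega$ for $c = 1$.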


3.7  Example:  Fourier  transform  of  a rectangular  wave  packet: 

Assume unity amplitude of the frequency distribution between $\omega_0 - \Delta\omega \le \omega \le \omega_0 + \Delta\omega$, that is, a single isolated square pulse in the frequency domain of half-width $\Delta\omega$ that is described by the rectangular function $\Pi$ defined as

$$\Pi(\omega) = \begin{cases} 1 & |\omega-\omega_0| < \Delta\omega \\ 0 & |\omega-\omega_0| > \Delta\omega \end{cases}$$

Then the Fourier transform is given by

$$f(t) = \left[\frac{\sin\Delta\omega t}{\Delta\omega t}\right]\cos\omega_0 t$$

That is, the transform of a rectangular wavepacket gives a cosine wave modulated by an unnormalized sinc function, which is a nice example of a simple wave packet. That is, on the right hand side we have a wavepacket $\Delta t = \pm\frac{\pi}{\Delta\omega}$ wide. Note that the product of the two measures of the widths is $\Delta\omega\cdot\Delta t = \pm\pi$. Example I.2 considers a rectangular pulse of unity amplitude between $-\frac{\tau}{2} \le t \le \frac{\tau}{2}$, which resulted in a Fourier transform $G(\omega) = \tau\left[\frac{\sin\frac{\omega\tau}{2}}{\frac{\omega\tau}{2}}\right]$. That is, for a pulse of width $\Delta t = \pm\frac{\tau}{2}$ the frequency envelope has the first zero at $\Delta\omega = \pm\frac{2\pi}{\tau}$. Note that this is the complementary system to the one considered here, which has $\Delta\omega\cdot\Delta t = \pm\pi$, illustrating the symmetry of the Fourier transform and its inverse.
3.11.3  Wave-packet  Uncertainty  Principle

The Uncertainty Principle states that for all types of wave motion there is a minimum product of the uncertainty in the width of a wave packet and the distribution width of the frequency decomposition of the wave packet. This was illustrated by the Fourier transforms of wave packets discussed above, where it was shown that the product of the widths is minimized for a Gaussian-shaped wave packet. The Uncertainty Principle implies that to make a precise measurement of the frequency of a sinusoidal wave requires that the wave packet be infinitely long. If the length of the wave packet is reduced then the frequency distribution broadens. The crucial aspect needed for this discussion is that, for the amplitudes of any wavepacket, the standard deviations $\sigma(t) = \sqrt{\langle t^2\rangle - \langle t\rangle^2}$ characterizing the width of the spectral distribution in the angular frequency domain, $\sigma_A(\omega)$, and the width in time, $\sigma_A(t)$, are related:

$$\sigma_A(t)\cdot\sigma_A(\omega) \ge 1 \qquad \text{(Relation between amplitude uncertainties)}$$

This product of the standard deviations equals unity only for the special case of Gaussian-shaped spectral distributions, and is greater than unity for all other shaped spectral distributions.

The intensity of the wave is the square of the amplitude, leading to standard deviation widths for a Gaussian distribution where $\sigma_I(t)^2 = \frac{1}{2}\sigma_A(t)^2$, that is, $\sigma_I(t) = \frac{\sigma_A(t)}{\sqrt{2}}$. Thus the standard deviations for the spectral distribution and width of the intensity of the wavepacket are related by:

$$\sigma_I(t)\cdot\sigma_I(\omega) \geq \frac{1}{2}$$ (Uncertainty principle for frequency-time intensities)

This states that the uncertainties with which you can simultaneously measure the time and frequency for the intensity of a given wavepacket are related. If you try to measure the frequency within a short time interval $\sigma_I(t)$ then the uncertainty in the frequency measurement is $\sigma_I(\omega) \geq \frac{1}{2\sigma_I(t)}$. Accurate measurement of the frequency requires measurement times that encompass many cycles of oscillation, that is, a long wavepacket.

Exactly the same relations exist between the spectral distribution as a function of wavenumber $k_x$ and the spatial dependence of a wave on $x$, which are conjugate representations. Thus the spectral distribution plotted versus $k_x$ is directly related to the amplitude as a function of position $x$; the spectral distribution versus $k_y$ is related to the amplitude as a function of $y$; and the $k_z$ spectral distribution is related to the spatial dependence on $z$. Following the same arguments discussed above, the standard deviation $\sigma_I(k_x)$, characterizing the width of the spectral intensity distribution of $k_x$, and the standard deviation $\sigma_I(x)$, characterizing the spatial width of the wave packet intensity as a function of $x$, are related by the Uncertainty Principle for position-wavenumber. Thus, in summary, the uncertainty principle for the intensity of wave motion is

$$\sigma_I(t)\cdot\sigma_I(\omega) \geq \frac{1}{2}$$ (3.128)

$$\sigma_I(x)\cdot\sigma_I(k_x) \geq \frac{1}{2} \qquad \sigma_I(y)\cdot\sigma_I(k_y) \geq \frac{1}{2} \qquad \sigma_I(z)\cdot\sigma_I(k_z) \geq \frac{1}{2}$$

This applies to all forms of wave motion, be they sound waves, water waves, electromagnetic waves, or matter waves.
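
The intensity relation can be checked numerically. The following is a minimal sketch, not part of the original text, that assumes a real, even amplitude $f(t)$ so that $\langle t\rangle = \langle\omega\rangle = 0$, and that uses Parseval's theorem to obtain $\langle\omega^2\rangle$ from the time derivative of $f(t)$ rather than from an explicit Fourier transform. The function name, grid sizes, and the triangular test pulse are illustrative assumptions.

```python
import math

def intensity_width_product(f, t_max, n=20001):
    """Return sigma_I(t) * sigma_I(omega) for a real, even amplitude f(t)."""
    dt = 2.0 * t_max / (n - 1)
    ts = [-t_max + i * dt for i in range(n)]
    fs = [f(t) for t in ts]
    norm = sum(v * v for v in fs) * dt                      # integral of f^2
    mean_t2 = sum(t * t * v * v for t, v in zip(ts, fs)) * dt / norm
    # Parseval: <omega^2> = integral |df/dt|^2 dt / integral |f|^2 dt
    slopes = [(fs[i + 1] - fs[i]) / dt for i in range(n - 1)]
    mean_w2 = sum(s * s for s in slopes) * dt / norm
    return math.sqrt(mean_t2 * mean_w2)

def gaussian(t):
    return math.exp(-t * t / 2.0)

def triangle(t):
    return max(0.0, 1.0 - abs(t))

gaussian_product = intensity_width_product(gaussian, t_max=10.0)
triangle_product = intensity_width_product(triangle, t_max=2.0)
print(f"Gaussian pulse:   {gaussian_product:.4f}")
print(f"Triangular pulse: {triangle_product:.4f}")
```

Under these assumptions the Gaussian amplitude gives a product close to the minimum value of 1/2, while the triangular pulse gives a somewhat larger value, as the inequality requires.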

As discussed in chapter 17, the transition to quantum mechanics involves relating the matter-wave properties to the energy and momentum of the corresponding particle. That is, in the case of matter waves, the de Broglie relations relate the particle energy to the angular frequency by $E = \hbar\omega$, and the particle momentum to the wavenumber by $\mathbf{p} = \hbar\mathbf{k}$. Multiplying both sides of equation 3.128 by $\hbar$ and using these relations leads to the Heisenberg Uncertainty Principle:

$$\sigma_I(t)\cdot\sigma_I(E) \geq \frac{\hbar}{2}$$ (3.129)

$$\sigma_I(x)\cdot\sigma_I(p_x) \geq \frac{\hbar}{2} \qquad \sigma_I(y)\cdot\sigma_I(p_y) \geq \frac{\hbar}{2} \qquad \sigma_I(z)\cdot\sigma_I(p_z) \geq \frac{\hbar}{2}$$

This uncertainty principle applies equally to the wavefunction of the electron in the hydrogen atom, to a proton in a nucleus, and to a wavepacket describing a particle wave moving along some trajectory. Thus, for a particle of given momentum, the wavefunction is spread out spatially. Planck’s constant $\hbar = 1.054\times10^{-34}$ J·s $= 6.582\times10^{-16}$ eV·s is extremely small compared with energies and times encountered in normal life, and thus the effects due to the Uncertainty Principle are not manifest for macroscopic dimensions.

Confinement of a particle of mass $m$ within $\pm\sigma(x)$ of a fixed location implies that there is a corresponding uncertainty in the momentum

$$\sigma(p_x) \geq \frac{\hbar}{2\sigma(x)}$$ (3.130)


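Now the variance in momentum $\mathbf{p}$ is given by the difference between the average of the square, $\langle\mathbf{p}\cdot\mathbf{p}\rangle$, and the square of the average, $\langle\mathbf{p}\rangle^2$. That is

$$\sigma(p)^2 = \langle\mathbf{p}\cdot\mathbf{p}\rangle - \langle\mathbf{p}\rangle^2$$ (3.131)

Assuming a fixed average location implies that $\langle\mathbf{p}\rangle = 0$, then

$$\langle\mathbf{p}\cdot\mathbf{p}\rangle = \sigma(p)^2 \geq \left(\frac{\hbar}{2\sigma(r)}\right)^2$$ (3.132)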


The kinetic energy is then given by:

$$\text{Kinetic energy} = \frac{\langle p^2\rangle}{2m} \geq \frac{\hbar^2}{8m\sigma(r)^2}$$ (Zero-point energy)

This zero-point energy is the minimum kinetic energy that a particle of mass $m$ can have if confined within a distance $\pm\sigma(r)$. This zero-point energy is a consequence of wave-particle duality and the uncertainty between the size and wavenumber for any wave packet. It is a quantal effect in that the classical limit has $\hbar\to0$, for which the zero-point energy $\to0$.

Inserting numbers for the zero-point energy gives that an electron confined to the radius of the atom, that is, $\sigma(x) = 10^{-10}$ m, has a zero-point kinetic energy of $\sim1$ eV. Confining this electron to $3\times10^{-15}$ m, the size of a nucleus, gives a zero-point energy of $10^{9}$ eV (1 GeV). Confining a proton to the size of the nucleus gives a zero-point energy of 0.5 MeV. These values are typical of the level spacing observed in atomic and nuclear physics. If $\hbar$ were a large number, then a billiard ball confined to a billiard table would be a blur as it oscillated with the minimum zero-point kinetic energy. The smaller the spatial region to which the ball was confined, the larger would be its zero-point energy and momentum, causing it to rattle back and forth between the boundaries of the confined region. Life would be dramatically different if $\hbar$ were a large number.
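
These order-of-magnitude estimates can be reproduced with a short calculation. The sketch below evaluates the non-relativistic bound $\hbar^2/(8m\sigma^2)$; the particle masses and the joule-to-eV conversion are standard values that are not quoted in the text, and the function name is illustrative. The electron-in-nucleus case is far into the relativistic regime, so that number should be read only as an order of magnitude.

```python
# Zero-point kinetic energy bound: E >= hbar^2 / (8 m sigma^2), non-relativistic.
HBAR = 1.054e-34        # J*s, value quoted in the text
EV = 1.602e-19          # J per eV (standard value, assumed here)
M_ELECTRON = 9.109e-31  # kg (standard value, assumed here)
M_PROTON = 1.673e-27    # kg (standard value, assumed here)

def zero_point_energy_ev(mass_kg, sigma_m):
    """Minimum kinetic energy in eV for confinement within +/- sigma_m."""
    return HBAR**2 / (8.0 * mass_kg * sigma_m**2) / EV

electron_atom = zero_point_energy_ev(M_ELECTRON, 1e-10)
electron_nucleus = zero_point_energy_ev(M_ELECTRON, 3e-15)
proton_nucleus = zero_point_energy_ev(M_PROTON, 3e-15)

print(f"electron in atom:    {electron_atom:.2e} eV")
print(f"electron in nucleus: {electron_nucleus:.2e} eV")
print(f"proton in nucleus:   {proton_nucleus:.2e} eV")
```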

In summary, Heisenberg’s Uncertainty Principle is a well-known and crucially important aspect of quantum physics. What is less well known is that the Uncertainty Principle exists for all forms of wave motion, that is, it is not restricted to matter waves. The following three examples illustrate application of the Uncertainty Principle to acoustics, the nuclear Mössbauer effect, and quantum mechanics.

3.8  Example:  Acoustic  wave  packet 

A violinist plays the note middle C (261.625 Hz) with constant intensity for precisely 2 seconds. Using the fact that the velocity of sound in air is 343.2 m/s, calculate the following:

1) The wavelength of the sound wave in air: $\lambda = 343.2/261.625 = 1.312$ m.

2) The length of the wavepacket in air: Wavepacket length $= 343.2\times2 = 686.4$ m.

3) The fractional frequency width of the note: Since the wave packet has a square pulse shape of length $\tau = 2$ s, the Fourier transform is a sinc function having its first zeros when $\sin\frac{\Delta\omega\,\tau}{2} = 0$, that is, $\Delta\nu = \frac{1}{\tau}$. Therefore the fractional width is $\frac{\Delta\nu}{\nu} = 0.0019$. Note that to achieve a purity of $\frac{\Delta\nu}{\nu} = 10^{-6}$ the violinist would have to play the note for 1.06 hours.
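
The arithmetic of this example can be sketched as follows. The variable and function names are illustrative, and the width is taken as $\Delta\nu = 1/\tau$, the first zero of the sinc envelope.

```python
SPEED_OF_SOUND = 343.2   # m/s, from the example
FREQUENCY = 261.625      # Hz, middle C
DURATION = 2.0           # s, length of the note

wavelength = SPEED_OF_SOUND / FREQUENCY       # metres
packet_length = SPEED_OF_SOUND * DURATION     # metres
delta_nu = 1.0 / DURATION                     # Hz, first zero of the sinc envelope
fractional_width = delta_nu / FREQUENCY

def duration_for_purity(purity, frequency_hz):
    """Note length in seconds needed for a fractional width equal to purity."""
    return 1.0 / (purity * frequency_hz)

hours_for_ppm = duration_for_purity(1e-6, FREQUENCY) / 3600.0

print(f"wavelength       = {wavelength:.3f} m")
print(f"packet length    = {packet_length:.1f} m")
print(f"fractional width = {fractional_width:.4f}")
print(f"time for 1e-6    = {hours_for_ppm:.2f} hours")
```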
3.9 Example: Gravitational red shift

The Mössbauer effect in nuclear physics provides a wave packet that has an exceptionally small fractional width in frequency. For example, the $^{57}$Fe nucleus emits a 14.4 keV deexcitation-energy photon, which corresponds to $\omega\approx2\times10^{19}$ rad/s, with a decay time of $\tau\approx10^{-7}$ s. Thus the fractional width is $\frac{\Delta\omega}{\omega}\approx3\times10^{-13}$. In 1959 Pound and Rebka used this to test Einstein’s general theory of relativity by measurement of the gravitational red shift between the attic and basement of the 22.5 m high physics building at Harvard. The magnitude of the predicted relativistic fractional red shift is $2.5\times10^{-15}$, which is what was observed with a fractional precision of about 1%.
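
The orders of magnitude in this example can be estimated as in the sketch below. It assumes $\omega = E/\hbar$, a width $\Delta\omega\approx1/\tau$, and the weak-field red shift $gh/c^2$. The values of $g$ and $c$ are standard values not given in the text, and with the rounded decay time $\tau = 10^{-7}$ s the width comes out at a few times $10^{-13}$.

```python
HBAR_EV_S = 6.582e-16    # eV*s, value quoted earlier in the text
G_ACCEL = 9.81           # m/s^2 (standard value, assumed here)
LIGHT_SPEED = 2.998e8    # m/s (standard value, assumed here)

photon_energy_ev = 14.4e3    # 57Fe transition energy
decay_time = 1.0e-7          # s, rounded decay time
tower_height = 22.5          # m

omega = photon_energy_ev / HBAR_EV_S              # angular frequency, rad/s
fractional_width = 1.0 / (omega * decay_time)     # Delta omega / omega
red_shift = G_ACCEL * tower_height / LIGHT_SPEED**2

print(f"omega            = {omega:.2e} rad/s")
print(f"fractional width = {fractional_width:.1e}")
print(f"red shift g*h/c^2 = {red_shift:.2e}")
```

The predicted shift is only about one percent of the line width, which is why such a narrow line is needed for the measurement.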


3.10  Example:  Quantum  baseball 

George Gamow, in his book “Mr. Tompkins in Wonderland”, describes the strange world that would exist if $\hbar$ was a large number. As an example, consider playing baseball in a universe where $\hbar$ is a large number. The pitcher throws a 150 g ball 20 m to the batter at a speed of 40 m/s. For a strike to be thrown, the ball must be pitched within the 30 cm radius of the strike zone, that is, it is required that $\Delta x\leq0.3$ m. The uncertainty relation tells us that the transverse velocity of the ball cannot be less than $\Delta v = \frac{\hbar}{2m\Delta x}$. The time of flight of the ball from the mound to batter is $t = 0.5$ s. Because of the transverse velocity uncertainty, $\Delta v$, the ball will deviate $t\Delta v$ transversely from the strike zone. This also must not exceed the size of the strike zone, that is,

$$t\Delta v = \frac{\hbar t}{2m\Delta x} \leq 0.3\ \text{m}$$ (Due to transverse velocity uncertainty)

Combining both of these requirements gives

$$\hbar \leq \frac{2m\Delta x^2}{t} = 5.4\times10^{-2}\ \text{J·s}.$$

This is 32 orders of magnitude larger than $\hbar$, so quantal effects are negligible. However, if $\hbar$ exceeded the above value, then the pitcher would have difficulty throwing a reliable strike.
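
The numbers in this example can be reproduced with the sketch below, which uses the bound $\hbar\leq2m\Delta x^2/t$ and the value of $\hbar$ quoted earlier. The variable names are illustrative.

```python
import math

HBAR = 1.054e-34   # J*s, value quoted in the text

mass = 0.150       # kg, baseball
distance = 20.0    # m, mound to batter
speed = 40.0       # m/s, pitch speed
delta_x = 0.3      # m, radius of the strike zone

flight_time = distance / speed                      # seconds
hbar_limit = 2.0 * mass * delta_x**2 / flight_time  # J*s
orders_of_magnitude = math.log10(hbar_limit / HBAR)

print(f"time of flight = {flight_time:.2f} s")
print(f"hbar limit     = {hbar_limit:.2e} J*s")
print(f"ratio to hbar  = 10^{orders_of_magnitude:.1f}")
```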


3.12  Summary 

Linear systems have the feature that the solutions obey the Principle of Superposition, that is, the amplitudes add linearly for the superposition of different oscillatory modes. Applicability of the Principle of Superposition to a system provides a tremendous advantage for handling and solving the equations of motion of oscillatory systems.

Geometric representations of the motion of dynamical systems provide sensitive probes of periodic motion. Configuration space $(\mathbf{q}, \mathbf{q}, t)$, state space $(\mathbf{q}, \dot{\mathbf{q}}, t)$ and phase space $(\mathbf{q}, \mathbf{p}, t)$ are powerful geometric representations that are used extensively for recognizing periodic motion, where $\mathbf{q}$, $\dot{\mathbf{q}}$, and $\mathbf{p}$ are vectors in n-dimensional space.


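Linearly-damped free linear oscillator The free linearly-damped linear oscillator is characterized by the equation

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2x = 0$$ (3.26)

The solutions of the linearly-damped free linear oscillator are of the form

$$z = e^{-\frac{\Gamma}{2}t}\left[z_1e^{i\omega_1t} + z_2e^{-i\omega_1t}\right]$$ (3.33)

The solutions fall into three categories

$x(t) = Ae^{-\frac{\Gamma}{2}t}\cos(\omega_1t - \beta)$   underdamped

$x(t) = \left[A_1e^{-\omega_+t} + A_2e^{-\omega_-t}\right]$   overdamped

$x(t) = (A + Bt)e^{-\frac{\Gamma}{2}t}$   critically damped

where $\omega_1\equiv\sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$ and $\omega_\pm = \frac{\Gamma}{2}\pm\sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}$.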
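
The three categories can be sketched in code as follows. This is a minimal illustration, assuming the regimes are separated by comparing $\Gamma/2$ with $\omega_0$ and taking $\omega_1 = \sqrt{\omega_0^2 - (\Gamma/2)^2}$; the function names and the finite-difference residual check are illustrative, not part of the original text.

```python
import math

def damping_regime(omega0, gamma):
    """Classify the free linearly-damped oscillator by comparing gamma/2 with omega0."""
    half = gamma / 2.0
    if math.isclose(half, omega0):
        return "critically damped"
    return "underdamped" if half < omega0 else "overdamped"

def underdamped_x(t, omega0, gamma, amplitude=1.0, beta=0.0):
    """Underdamped solution x(t) = A exp(-gamma t / 2) cos(omega1 t - beta)."""
    omega1 = math.sqrt(omega0**2 - (gamma / 2.0) ** 2)
    return amplitude * math.exp(-gamma * t / 2.0) * math.cos(omega1 * t - beta)

def residual(t, omega0, gamma, h=1e-4):
    """Finite-difference estimate of x'' + gamma x' + omega0^2 x; should be near zero."""
    x_minus = underdamped_x(t - h, omega0, gamma)
    x_mid = underdamped_x(t, omega0, gamma)
    x_plus = underdamped_x(t + h, omega0, gamma)
    x_dot = (x_plus - x_minus) / (2.0 * h)
    x_ddot = (x_plus - 2.0 * x_mid + x_minus) / h**2
    return x_ddot + gamma * x_dot + omega0**2 * x_mid

check = residual(1.3, omega0=2.0, gamma=1.0)
print(damping_regime(2.0, 1.0), f"residual = {check:.1e}")
```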
The energy dissipation for the linearly-damped free linear oscillator, time averaged over one period, is given by

$$\langle E\rangle = E_0e^{-\Gamma t}$$ (3.44)

The quality factor $Q$ characterizing the damping of the free oscillator is defined to be

$$Q = \frac{E}{\Delta E} = \frac{\omega_1}{\Gamma}$$ (3.47)

where $\Delta E$ is the energy dissipated per radian.


Sinusoidally-driven, linearly-damped, linear oscillator The linearly-damped linear oscillator, driven by a harmonic driving force, is of considerable importance to all branches of physics and engineering. The equation of motion can be written as

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2x = \frac{F(t)}{m}$$ (3.49)

where $F(t)$ is the driving force. The complete solution of this second-order differential equation comprises two components, the complementary solution (transient response) and the particular solution (steady-state response). That is,

$$x(t)_{Total} = x(t)_T + x(t)_S$$ (3.65)

For the underdamped case, the transient solution is the complementary solution

$$x(t)_T = \frac{F_0}{m}e^{-\frac{\Gamma}{2}t}\cos(\omega_1t - \delta)$$ (3.66)

and the steady-state solution is given by the particular solution

$$x(t)_S = \frac{\frac{F_0}{m}}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}}\cos(\omega t - \delta)$$ (3.67)


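Resonance A detailed discussion of resonance and energy absorption for the driven linearly-damped linear oscillator was given. For resonance the maximum amplitudes occur at frequencies

undamped free linear oscillator: $\omega_0 = \sqrt{\frac{k}{m}}$

linearly-damped free linear oscillator: $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$

driven linearly-damped linear oscillator: $\omega_R = \sqrt{\omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2}$

The energy absorption for the steady-state solution for resonance is given by

$$x(t)_S = A_{el}\cos\omega t + A_{abs}\sin\omega t$$ (3.73)

where the elastic amplitude

$$A_{el} = \frac{\frac{F_0}{m}}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}\left(\omega_0^2 - \omega^2\right)$$ (3.74)

while the absorptive amplitude

$$A_{abs} = \frac{\frac{F_0}{m}}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}\,\Gamma\omega$$ (3.75)

The time average power input is given by only the absorptive term

$$\langle P\rangle = \frac{1}{2}F_0\omega A_{abs} = \frac{F_0^2}{2m}\frac{\Gamma\omega^2}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}$$ (3.133)

This power curve has the classic Lorentzian shape.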
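
The distinction between the amplitude resonance and the power resonance can be illustrated numerically. The sketch below assumes the standard steady-state amplitude $\frac{F_0/m}{\sqrt{(\omega_0^2-\omega^2)^2+(\Gamma\omega)^2}}$ and mean power $\frac{F_0^2}{2m}\frac{\Gamma\omega^2}{(\omega_0^2-\omega^2)^2+(\Gamma\omega)^2}$ for the driven linearly-damped oscillator; the parameter values and the frequency grid are arbitrary illustrative choices.

```python
import math

def steady_state_amplitude(omega, omega0, gamma, f0_over_m=1.0):
    """Steady-state displacement amplitude of the driven damped oscillator."""
    return f0_over_m / math.sqrt((omega0**2 - omega**2) ** 2 + (gamma * omega) ** 2)

def mean_power(omega, omega0, gamma, f0=1.0, m=1.0):
    """Time-averaged power absorbed from the driving force."""
    denom = (omega0**2 - omega**2) ** 2 + (gamma * omega) ** 2
    return (f0**2 / (2.0 * m)) * gamma * omega**2 / denom

OMEGA0, GAMMA = 1.0, 0.4   # illustrative values
grid = [0.5 + 0.001 * i for i in range(1001)]

amplitude_peak = max(grid, key=lambda w: steady_state_amplitude(w, OMEGA0, GAMMA))
power_peak = max(grid, key=lambda w: mean_power(w, OMEGA0, GAMMA))
omega_r = math.sqrt(OMEGA0**2 - 2.0 * (GAMMA / 2.0) ** 2)

print(f"amplitude peaks near {amplitude_peak:.3f} (omega_R = {omega_r:.3f})")
print(f"power peaks near     {power_peak:.3f} (omega_0 = {OMEGA0:.3f})")
```

On this grid the amplitude peaks slightly below $\omega_0$, at $\omega_R$, while the absorbed power peaks at $\omega_0$ itself.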
Wave propagation The wave equation was introduced and both travelling and standing wave solutions
of  the  wave  equation  were  discussed.  Harmonic  wave-form  analysis,  and  the  complementary  time-sampled 
wave  form  analysis  techniques,  were  introduced  in  this  chapter  and  in  appendix  I.  The  relative  merits  of 
Fourier  analysis  and  the  digital  Green’s  function  waveform  analysis  were  illustrated  for  signal  processing. 

The concepts of phase velocity, group velocity, and signal velocity were introduced. The phase velocity is given by

$$v_{phase} = \frac{\omega}{k}$$ (3.117)

and group velocity

$$v_{group} = \left(\frac{d\omega}{dk}\right) = v_{phase} + k\frac{\partial v_{phase}}{\partial k}$$ (3.128)

If the group velocity is frequency dependent then the information content of a wave packet travels at the signal velocity which can differ from the group velocity.
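
The two forms of the group velocity can be compared numerically for a dispersive medium. The sketch below uses the deep-water gravity-wave dispersion relation $\omega = \sqrt{gk}$ purely as an illustrative assumption (it is not taken from this chapter), and estimates the derivatives by central differences.

```python
import math

G_ACCEL = 9.81  # m/s^2, assumed value for the illustrative dispersion relation

def omega_deep_water(k):
    """Assumed dispersion relation omega(k) = sqrt(g k) for deep-water waves."""
    return math.sqrt(G_ACCEL * k)

def phase_velocity(omega_of_k, k):
    return omega_of_k(k) / k

def group_velocity(omega_of_k, k, dk=1e-6):
    """d(omega)/dk by central difference."""
    return (omega_of_k(k + dk) - omega_of_k(k - dk)) / (2.0 * dk)

def group_velocity_from_phase(omega_of_k, k, dk=1e-6):
    """v_phase + k * d(v_phase)/dk by central difference."""
    slope = (phase_velocity(omega_of_k, k + dk)
             - phase_velocity(omega_of_k, k - dk)) / (2.0 * dk)
    return phase_velocity(omega_of_k, k) + k * slope

k = 2.0  # rad/m, illustrative wavenumber
v_p = phase_velocity(omega_deep_water, k)
v_g = group_velocity(omega_deep_water, k)
v_g_alt = group_velocity_from_phase(omega_deep_water, k)
print(f"v_phase = {v_p:.4f} m/s, v_group = {v_g:.4f} m/s, alternative = {v_g_alt:.4f} m/s")
```

For this dispersion relation both expressions give a group velocity equal to half the phase velocity.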

The Wave-packet Uncertainty Principle implies that making a precise measurement of the frequency of a sinusoidal wave requires that the wave packet be infinitely long. The standard deviations $\sigma(t) = \sqrt{\langle t^2\rangle - \langle t\rangle^2}$ characterizing the width of the amplitude of the wavepacket spectral distribution in the angular frequency domain, $\sigma_A(\omega)$, and the corresponding width in time, $\sigma_A(t)$, are related by:

$$\sigma_A(t)\cdot\sigma_A(\omega) \geq 1$$ (Relation between amplitude uncertainties.)

The standard deviations for the spectral distribution and width of the intensity of the wave packet are related by:

$$\sigma_I(t)\cdot\sigma_I(\omega) \geq \frac{1}{2}$$ (3.134)

$$\sigma_I(x)\cdot\sigma_I(k_x) \geq \frac{1}{2} \qquad \sigma_I(y)\cdot\sigma_I(k_y) \geq \frac{1}{2} \qquad \sigma_I(z)\cdot\sigma_I(k_z) \geq \frac{1}{2}$$

This applies to all forms of wave motion, including sound waves, water waves, electromagnetic waves, and matter waves.
Workshop exercises

1. Given below is a list of statements followed by a list of reasons related to harmonic motion. For each of the statements, determine the reason(s) that make that statement true. You may do this in small groups or as one large group; the teaching assistant will decide what works best for your workshop.

Statements: 

• We  can  neglect  the  higher  order  terms  in  the  Taylor  expansion  of  F(x). 

• The  restoring  force  is  a linear  force. 

• $F_0$ must vanish.

• $(dF/dx)_0$ is negative and k is positive.

• We  can  write  F(x)  as  a Taylor  series  expansion. 

Reasons: 

• F(x)  depends  only  on  x. 

• A position  of  stable  equilibrium  exists  and  we  call  this  point  the  origin  of  our  coordinate  system. 

• F(x)  has  continuous  derivatives  of  all  orders. 

• The  restoring  force  is  directed  toward  the  equilibrium  position. 

• We  consider  only  small  displacements. 

2.  Second-order  ordinary  differential  equations  are  an  important  part  of  the  physics  of  the  harmonic  oscillator. 

(a)  What  do  each  of  the  following  terms  mean  with  respect  to  differential  equations? 

i.  Ordinary 

ii.  Second-order 

iii.  Homogeneous 

iv.  Linear 

(b)  Give  a mini-lesson  on  how  to  solve  second-order  differential  equations  by  working  through  the  following 
examples.  Don’t  just  provide  a solution;  explain  the  steps  leading  up  to  the  solution. 

i. $y'' + 5y' + 6y = 0$

ii. $y'' + y' + y = 0$

iii. $y'' + 4y' + 4y = 0$

iv.  y"—3y,2x 

v. $y'' - 3y' - 4y = 2\sin x$

3.  Harmonic  oscillations  occur  for  many  different  types  of  systems  and  it  is  important  to  recognize  when  the 
equations  for  harmonic  motion  apply.  Three  different  systems  are  described  below.  Each  system  can  be 
approximately described using the equations for harmonic motion. Break up into three groups, one group per
system.  For  your  group’s  system,  answer  the  following  questions: 

(a)  What  approximations  are  necessary  for  this  system  to  exhibit  harmonic  oscillations? 

(b)  What  is  the  differential  equation  that  governs  the  motion  of  this  system?  Use  Newton’s  second  law  to 
arrive  at  this  equation. 

(c)  What  is  the  solution  to  the  differential  equation  that  you  found  in  part  (b)? 

(d)  What  is  the  natural  frequency  of  oscillations? 

Here  are  the  three  systems: 


• A mass  m is  tied  to  a massless  spring  having  a spring  constant  k.  The  system  oscillates  in  one  dimension 
along  a horizontal  frictionless  surface. 
• A particle of mass m is attached to a weightless, extensionless rod to form a pendulum. The length of the rod is L and the system oscillates in a single plane.

• A tube  is  bent  into  the  shape  of  a U and  is  partially  filled  with  a liquid  of  density  p.  The  cross-sectional 
area  of  the  tube  is  A and  the  length  of  the  tube  filled  with  liquid  is  L.  The  liquid  is  initially  displaced  so 
that  it  is  higher  on  one  side  of  the  tube  than  the  other. 

Once  each  group  has  answered  all  of  the  questions,  share  the  results  with  the  entire  class. 

4. Consider a mass m attached to a spring of spring constant k. The spring is mounted horizontally so that the mass oscillates horizontally on a frictionless surface. The spring is attached to the wall on the right and the mass is initially moved to the right of its equilibrium position (compressing the spring) by a distance s and released. Working individually, determine how (if at all) the period of the motion would be affected by each of the changes below. Once you have answered each part on your own, compare your answers with a classmate.

(a)  The  spring  is  replaced  with  a stiffer  spring. 

(b)  The  mass  is  initially  displaced  a distance  s to  the  left  and  released. 

(c)  The  mass  is  replaced  with  a heavier  mass. 

(d)  The  mass  is  initially  displaced  a distance  r (r  < s)  to  the  right  and  released. 

5. When you were first introduced to simple harmonic motion, you used the formula $m\ddot{x} = -kx$ to find the
position  of  the  oscillating  mass  as  a function  of  time.  This  assumes  that  the  origin  is  defined  to  be  the 
equilibrium  point.  What  happens  if  this  is  not  the  case?  What  would  the  equation  of  motion  look  like?  How 
would  the  position  of  the  oscillating  mass  as  a function  of  time  change? 

6. For each of the situations described below, give a rough sketch of the state space diagram ($\dot{x}$ versus $x$) that represents the motion of each object. All of the motion takes place along the x-axis.

(a) An eggplant is at rest at a point on the +x axis.

(b) A monkey on a skateboard skates with constant speed in the negative x direction.

(c) A race car moving in the +x direction undergoes constant acceleration until it abruptly stops.

(d) A cantaloupe undergoes simple harmonic motion. The initial location of the cantaloupe is at a point on the +x axis.

7. Consider a simple harmonic oscillator consisting of a mass m attached to a spring of spring constant k. For this oscillator $x(t) = A\sin(\omega_0t - \delta)$.

(a) Find an expression for $\dot{x}(t)$.

(b) Eliminate $t$ between $x(t)$ and $\dot{x}(t)$ to arrive at one equation similar to that for an ellipse.

(c) Rewrite the equation in part (b) in terms of $x$, $\dot{x}$, $k$, $m$, and the total energy $E$.

(d) Give a rough sketch of the phase space diagram ($\dot{x}$ versus $x$) for this oscillator. Also, on the same set of axes, sketch the phase space diagram for a similar oscillator with a total energy that is larger than that of the first oscillator.

(e)  What  direction  are  the  paths  that  you  have  sketched?  Explain  your  answer. 

(f)  Would  different  trajectories  for  the  same  oscillator  ever  cross  paths?  Why  or  why  not? 

8. Consider a damped, driven oscillator consisting of a mass m attached to a spring of spring constant k.

(a)  What  is  the  equation  of  motion  for  this  system? 

(b)  Solve  the  equation  in  part  (a).  The  solution  consists  of  two  parts,  the  complementary  solution  and  the 
particular  solution.  When  might  it  be  possible  to  safely  neglect  one  part  of  the  solution? 

(c)  What  is  the  difference  between  amplitude  resonance  and  kinetic  energy  resonance? 

(d)  How  might  phase  space  diagrams  look  for  this  type  of  oscillator?  What  variables  would  affect  the  diagram? 
9. A particle of mass m is subject to the following force

$$\mathbf{F} = A(x^3 - 4x^2 + 3x)\hat{\mathbf{x}}$$

where A is a constant.

(a) Determine the points where the particle is in equilibrium.

(b) Which of these points are stable and which are unstable?

(c)  Is  the  motion  bounded  or  unbounded? 

10.  A very  long  cylindrical  shell  has  a mass  density  that  depends  upon  the  radial  distance  such  that  p(r ) = 
where  k is  a constant.  The  inner  radius  of  the  shell  is  a and  the  outer  radius  is  b. 

(a)  Determine  the  direction  and  the  magnitude  of  the  gravitational  field  for  all  regions  of  space. 

(b)  If  the  gravitational  potential  is  zero  at  the  origin,  what  is  the  difference  between  the  gravitational  potential 
at  r = b and  r = a? 


11.  A mass  m is  constrained  to  move  along  one  dimension.  Two  identical  springs  are  attached  to  the  mass,  one  on 
each  side,  and  each  spring  is  in  turn  attached  to  a wall.  Both  springs  have  the  same  spring  constant  k. 

(a)  Determine  the  frequency  of  the  oscillation,  assuming  no  damping. 

(b)  Now  consider  damping.  It  is  observed  that  after  n oscillations,  the  amplitude  of  the  oscillation  has 
dropped  to  one-half  of  its  initial  value.  Find  an  expression  for  the  damping  constant. 

(c)  How  long  does  it  take  for  the  amplitude  to  decrease  to  one-quarter  of  its  initial  value? 


12.  Discuss  the  motion  of  a continuous  string  when  plucked  at  one  third  of  the  length  of  the  string.  That  is,  the 


initial  condition  is  q(x,  0)  = 0,  and  q{x,  0) 


0<a;<§ 
| y(L  — x),  j<x<L 


13. When a particular driving force is applied to a stretched string it is observed that the string vibration is purely of the nth harmonic. Find the driving force.


14.  Consider  the  two-mass  system  pivoted  at  its  vertex  where  M ^ m.  It  undergoes  oscillations  of  the  angle  9 
with  respect  to  the  vertical  in  the  plane  of  the  triangle. 


(a)  Determine  the  angular  frequency  of  small  oscillations. 

(b)  Use  your  result  from  part  (a)  to  show  to'2  « y for  At  m. 

(c) Show that your result from part (a) agrees with $\omega^2 = \frac{U''(\theta_e)}{I}$ where $\theta_e$ is the equilibrium angle and I is the moment of inertia.

(d) Assume the system has energy E. Set up an integral that determines the period of oscillation.

15. A cube of side a and mass m is immersed in water with density ρ past the point of equilibrium and then released. Assume there is no damping due to the water.

(a) Show that the cube’s equation of motion is

$$\frac{d^2x}{dt^2} + Ax + B = 0$$

where  A and  B are  constants.  Determine  A and  B. 
(b) The solution to the equation of motion is

$$x(t) = -\frac{B}{A} + C_1\cos\left(\sqrt{A}\,t\right) + C_2\sin\left(\sqrt{A}\,t\right)$$

where $C_1$ and $C_2$ are constants. If $x(0) = -a$, determine $x(t)$.

(c)  Determine  the  period  T of  oscillation. 

Problems 

1. An unusual pendulum is made by fixing a string to a horizontal cylinder of radius R, wrapping the string several times around the cylinder, and then tying a mass m to the loose end. In equilibrium the mass hangs a distance $l_0$ vertically below the edge of the cylinder. Find the potential energy if the pendulum has swung to an angle $\phi$ from the vertical. Show that for small angles, it can be written in the Hooke’s Law form $U = \frac{1}{2}k\phi^2$. Comment on the value of k.

2. Consider the two-dimensional anisotropic oscillator with motion with $\omega_x = p\omega$ and $\omega_y = q\omega$.

a) Prove that if the ratio of the frequencies is rational (that is, $\frac{\omega_x}{\omega_y} = \frac{p}{q}$ where p and q are integers) then the motion is periodic. What is the period?

b)  Prove  that  if  the  same  ratio  is  irrational,  the  motion  never  repeats  itself. 

3. A simple pendulum consists of a mass m, suspended from a fixed point by a weightless, extensionless rod of length l.

a) Obtain the equation of motion, and in the approximation $\sin\theta\approx\theta$, show that the natural frequency is $\omega_0 = \sqrt{\frac{g}{l}}$, where g is the gravitational field strength.

b) Discuss the motion in the event that the motion takes place in a viscous medium with retarding force $2m\sqrt{gl}\,\dot{\theta}$.

4. Derive the expression for the State Space paths of the plane pendulum if the total energy is $E > 2mgl$. Note that this is just the case of a particle moving in a periodic potential $U(\theta) = mgl(1 - \cos\theta)$. Sketch the State Space diagram for both $E > 2mgl$ and $E < 2mgl$.

5. Consider the motion of a driven linearly-damped harmonic oscillator after the transient solution has died out, and suppose that it is being driven close to resonance, $\omega = \omega_0$.

a) Show that the oscillator’s total energy is $E = \frac{1}{2}m\omega^2A^2$.

b) Show that the energy $\Delta E_{dis}$ dissipated during one cycle by the damping force $F_d$ is $\pi\Gamma m\omega A^2$.

6. Two masses $m_1$ and $m_2$ slide freely on a horizontal frictionless rail and are connected by a spring whose force constant is k. Find the frequency of oscillatory motion for this system.

7. A particle of mass m moves under the influence of a resistive force proportional to velocity and a potential U, that is

$$F(x,\dot{x}) = -b\dot{x} - \frac{\partial U}{\partial x}$$

where $b > 0$ and $U(x) = (x^2 - a^2)^2$.

a) Find the points of stable and unstable equilibrium.

b) Find the solution of the equations of motion for small oscillations around the stable equilibrium points.

c) Show that as $t\to\infty$ the particle approaches one of the stable equilibrium points for most choices of initial conditions. What are the exceptions? (Hint: You can prove this without finding the solutions explicitly.)


Chapter  4 


Nonlinear  systems  and  chaos 


4.1  Introduction 

In  nature  only  a subset  of  systems  have  equations  of  motion  that  are  linear.  Contrary  to  the  impression 
given  by  the  analytic  solutions  presented  in  undergraduate  physics  courses,  most  dynamical  systems  in  nature 
exhibit non-linear behavior that leads to complicated motion. Non-linear equations of motion usually do not have analytic solutions, superposition does not apply, and they predict phenomena such as attractors, discontinuous period bifurcation, extreme sensitivity to initial conditions, rolling motion, and chaos. There have been some exciting discoveries in classical mechanics during the past four decades associated with the recognition that nonlinear systems can exhibit chaos. Chaotic phenomena have been observed in most fields of science and engineering, such as weather patterns, fluid flow, motion of planets in the solar system, epidemics, changing populations of animals, birds and insects, and the motion of electrons in atoms. The complicated
dynamical  behavior  predicted  by  non-linear  differential  equations  is  not  limited  to  classical  mechanics,  rather 
it  is  a manifestation  of  the  mathematical  properties  of  the  solutions  of  the  differential  equations  involved, 
and  thus  is  generally  applicable  to  solutions  of  first  or  second-order  non-linear  differential  equations.  It  is 
important  to  understand  that  the  systems  discussed  in  this  chapter  follow  a fully  deterministic  evolution 
predicted  by  the  laws  of  classical  mechanics,  the  evolution  for  which  is  based  on  the  prior  history.  This 
behavior  is  completely  different  from  a random  walk  where  each  step  is  based  on  a random  process.  The 
complicated  motion  of  deterministic  non-linear  systems  stems  in  part  from  sensitivity  to  the  initial  conditions. 

The French mathematician Poincaré is credited with being the first to recognize the existence of chaos during his investigation of the gravitational three-body problem in celestial mechanics. At the end of the nineteenth century Poincaré noticed that such systems exhibit the high sensitivity to initial conditions characteristic of chaotic motion, and the existence of nonlinearity which is required to produce chaos. Poincaré’s work received little notice, in part because it was overshadowed by the parallel development of the Theory of Relativity and quantum mechanics at the start of the 20th century. In addition, solving nonlinear equations of motion is difficult, which discouraged work on nonlinear mechanics and chaotic motion. The field blossomed in the 1960s when computers became sufficiently powerful to solve the nonlinear equations required to calculate the long-time histories necessary to document chaotic behavior.

Laplace, and many other scientists, believed in the deterministic view of nature which assumes that if the position and velocities of all particles are known, then one can unambiguously predict the future motion using Newtonian mechanics. Researchers in many fields of science now realize that this “clockwork universe” is invalid. That is, knowing the laws of nature can be insufficient to predict the evolution of nonlinear systems, in that the time evolution can be extremely sensitive to the initial conditions even though the systems follow a completely deterministic development. There are two major classifications of nonlinear systems that lead to chaos in nature. The first classification encompasses nondissipative Hamiltonian systems such as Poincaré’s three-body celestial mechanics system. The other main classification involves driven, damped, non-linear oscillatory systems.

Nonlinearity and chaos form a broad and active field, and thus this chapter focuses on only a few examples
that illustrate the general features of non-linear systems. Weak non-linearity is used to illustrate bifurcation
and asymptotic attractor solutions for which the system evolves independent of the initial conditions. The
common sinusoidally-driven linearly-damped plane pendulum illustrates several features characteristic of the
evolution of a non-linear system from order to chaos. The impact of non-linearity on wavepacket propagation
velocities and the existence of soliton solutions is discussed. The example of the three-body problem is
discussed in chapter 9. The transition from laminar flow to turbulent flow is illustrated by fluid mechanics
discussed in chapter 15.8. Analytic solutions of nonlinear systems usually are not available and thus one
must resort to computer simulations. As a consequence the present discussion focuses on the main features
of the solutions for these systems and ignores how the equations of motion are solved.


4.2  Weak  nonlinearity 


Most physical oscillators become non-linear with increase in amplitude of the oscillations. Consequences
of non-linearity include breakdown of superposition, introduction of additional harmonics, and complicated
motion that is highly sensitive to the initial conditions and can become chaotic, as illustrated in this chapter.
Weak non-linearity is interesting since perturbation theory can be used to solve the non-linear equations of motion.

The potential energy function for a linear oscillator has a pure parabolic shape about the minimum
location, that is, $U = \frac{1}{2}k(x - x_0)^2$ where $x_0$ is the location of the minimum. For weak non-linear systems,
where the amplitude of oscillation $\Delta x$ about the minimum is small, it is useful to make a Taylor expansion
of the potential energy about the minimum. That is

$$U(\Delta x) = U(x_0) + \Delta x\frac{dU(x_0)}{dx} + \frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2} + \frac{\Delta x^3}{3!}\frac{d^3U(x_0)}{dx^3} + \frac{\Delta x^4}{4!}\frac{d^4U(x_0)}{dx^4} + \ldots \tag{4.1}$$


By definition, at the minimum $\frac{dU(x_0)}{dx} = 0$, and thus equation 4.1 can be written as

$$\Delta U = U(\Delta x) - U(x_0) = \frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2} + \frac{\Delta x^3}{3!}\frac{d^3U(x_0)}{dx^3} + \frac{\Delta x^4}{4!}\frac{d^4U(x_0)}{dx^4} + \ldots \tag{4.2}$$

For small amplitude oscillations the system is linear if only the second-order $\frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2}$ term in equation 4.2
is significant. The linearity for small amplitude oscillations greatly simplifies description of the oscillatory
motion in that superposition applies, and complicated chaotic motion is avoided. For slightly larger amplitude
motion, where the higher-order terms in the expansion are still much smaller than the second-order term,
perturbation theory can be used, as illustrated by the simple plane pendulum, which is non-linear since
the restoring force equals

$$mg\sin\theta \simeq mg\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \frac{\theta^7}{7!} + \ldots\right) \tag{4.3}$$

This is linear only at very small angles where the higher-order terms in the expansion can be neglected.
Consider the equation of motion at small amplitudes for the harmonically-driven, linearly-damped plane
pendulum

$$\ddot\theta + \Gamma\dot\theta + \omega_0^2\sin\theta \approx \ddot\theta + \Gamma\dot\theta + \omega_0^2\left(\theta - \frac{\theta^3}{6}\right) = F_0\cos(\omega t) \tag{4.4}$$

where only the first two terms in the expansion 4.3 have been included. It was shown in chapter 3 that when
$\sin\theta \approx \theta$ then the steady-state solution of equation 4.4 is of the form

$$\theta(t) = A\cos(\omega t - \delta) \tag{4.5}$$

Insert this first-order solution into equation 4.4; then the cubic term in the expansion gives a term
$\cos^3\omega t = \frac{1}{4}(\cos 3\omega t + 3\cos\omega t)$. Thus the perturbation expansion to third order involves a solution of the form

$$\theta(t) = A\cos(\omega t - \delta) + B\cos 3(\omega t - \delta) \tag{4.6}$$


This perturbation solution shows that the non-linear term has distorted the signal by addition of the third
harmonic of the driving frequency with an amplitude that depends sensitively on $\theta$. This illustrates that the
superposition principle is not obeyed for this non-linear system, but, if the non-linearity is weak, perturbation
theory can be used to derive the solution of a non-linear equation of motion.
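
The trigonometric identity quoted above can be checked in a few lines. The following is only a sketch of the intermediate algebra; the shorthand $\phi \equiv \omega t - \delta$ is introduced here for brevity and is not part of the original notation.

```latex
\begin{align*}
\cos^3\phi &= \left(\frac{e^{i\phi}+e^{-i\phi}}{2}\right)^{3}
            = \frac{1}{8}\left(e^{3i\phi}+3e^{i\phi}+3e^{-i\phi}+e^{-3i\phi}\right)
            = \frac{1}{4}\left(\cos 3\phi + 3\cos\phi\right), \\
\theta^3   &= A^3\cos^3\phi
            = \frac{3A^3}{4}\cos\phi + \frac{A^3}{4}\cos 3\phi ,
\qquad \phi \equiv \omega t - \delta .
\end{align*}
```

The cubic term in the expansion therefore feeds part of its strength back into the fundamental and the remainder into a term oscillating at $3\omega$, which is why a third-harmonic component must be admitted in the trial solution.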

Figure 4.1 illustrates that for a potential $U(x) = 2x^2 + x^4$, the $x^4$ non-linear term reduces the maximum
amplitude $x$, which makes the total energy contours in state-space more rectangular than the elliptical shape
for the harmonic oscillator shown in figure 3.3. The solution is of the form given in equation 4.6.

Figure 4.1: The left side shows the potential energy for a symmetric potential $U(x) = 2x^2 + x^4$. The right
side shows the contours of constant total energy on a state-space diagram.


4.1  Example:  Non-linear  oscillator 

Assume that a non-linear oscillator has a potential given by

$$U(x) = \frac{kx^2}{2} - \frac{m\lambda x^3}{3}$$

where $\lambda$ is small. Find the solution of the equation of motion to first order in $\lambda$, assuming $x = 0$ at $t = 0$.

The equation of motion for the nonlinear oscillator is

$$m\ddot{x} = -\frac{dU}{dx} = -kx + m\lambda x^2$$

If the $m\lambda x^2$ term is neglected, then the second-order equation of motion reduces to a normal linear oscillator
with

$$x_0 = A\sin(\omega_0 t + \varphi)$$

where

$$\omega_0 = \sqrt{\frac{k}{m}}$$

Assume that the first-order solution has the form

$$x = x_0 + \lambda x_1$$

Substituting this into the equation of motion, and neglecting terms of higher order than $\lambda$, gives

$$\ddot{x}_1 + \omega_0^2 x_1 = x_0^2 = \frac{A^2}{2}\left[1 - \cos(2\omega_0 t)\right]$$

where the initial condition $x = 0$ at $t = 0$ has been used to set $\varphi = 0$ in $x_0$. To solve this try a particular integral

$$x_1 = B + C\cos(2\omega_0 t)$$

Substituting this into the equation of motion gives

$$-3\omega_0^2 C\cos(2\omega_0 t) + \omega_0^2 B = \frac{A^2}{2} - \frac{A^2}{2}\cos(2\omega_0 t)$$

Comparison of the coefficients gives

$$B = \frac{A^2}{2\omega_0^2} \qquad C = \frac{A^2}{6\omega_0^2}$$

The homogeneous equation is

$$\ddot{x}_1 + \omega_0^2 x_1 = 0$$

which has a solution of the form

$$x_1 = D_1\sin(\omega_0 t) + D_2\cos(\omega_0 t)$$

Thus combining the particular and homogeneous solutions gives

$$x = (A + \lambda D_1)\sin(\omega_0 t) + \lambda\left[\frac{A^2}{2\omega_0^2} + D_2\cos(\omega_0 t) + \frac{A^2}{6\omega_0^2}\cos(2\omega_0 t)\right]$$

The initial condition $x = 0$ at $t = 0$ then gives

$$D_2 = -\frac{2A^2}{3\omega_0^2}$$

and

$$x = (A + \lambda D_1)\sin(\omega_0 t) + \frac{\lambda A^2}{\omega_0^2}\left[\frac{1}{2} - \frac{2}{3}\cos(\omega_0 t) + \frac{1}{6}\cos(2\omega_0 t)\right]$$

The constant $(A + \lambda D_1)$ is given by the initial amplitude and velocity.
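
The first-order result of this example can be compared with a direct numerical integration. The sketch below is illustrative only: it assumes the values $A = 1$, $\omega_0 = 1$, $\lambda = 0.01$ and $D_1 = 0$ (none of which are specified in the example), integrates $\ddot{x} = -\omega_0^2 x + \lambda x^2$ over one period with a hand-written fourth-order Runge-Kutta step, and reports how far the numerical solution lies from the zeroth-order and first-order expressions. The function names are hypothetical.

```python
import math

def rk4_step(f, t, y, h):
    """Advance y' = f(t, y) by one classical fourth-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def first_order(t, A, w0, lam):
    """First-order perturbation solution with D1 = 0 (assumed), so x'(0) = A*w0."""
    bracket = (0.5 - (2.0 / 3.0) * math.cos(w0 * t)
               + (1.0 / 6.0) * math.cos(2.0 * w0 * t))
    return A * math.sin(w0 * t) + lam * A ** 2 / w0 ** 2 * bracket

def compare(A=1.0, w0=1.0, lam=0.01, steps=2000):
    """Return (max error of zeroth order, max error of first order) over one period."""
    def f(t, y):
        # y[0] = x, y[1] = dx/dt; equation of motion x'' = -w0^2 x + lam x^2
        return [y[1], -w0 ** 2 * y[0] + lam * y[0] ** 2]

    h = 2.0 * math.pi / w0 / steps
    t, y = 0.0, [0.0, A * w0]          # x(0) = 0, x'(0) = A*w0
    err0 = err1 = 0.0
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
        err0 = max(err0, abs(y[0] - A * math.sin(w0 * t)))
        err1 = max(err1, abs(y[0] - first_order(t, A, w0, lam)))
    return err0, err1

if __name__ == "__main__":
    e0, e1 = compare()
    print("max |numeric - zeroth order| =", e0)
    print("max |numeric - first order|  =", e1)
```

For small $\lambda$ the first-order expression should track the numerical solution much more closely than the linear solution does, with the residual difference being of order $\lambda^2$.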

This system is nonlinear in that the output amplitude is not proportional to the input amplitude. In addition,
a large-amplitude second-harmonic component is introduced in the output waveform; that is, for a non-linear
system the gain and frequency decomposition of the output differ from those of the input. Note that the frequency
composition is amplitude dependent. This particular example of a nonlinear system does not exhibit chaos.
The Laboratory for Laser Energetics uses nonlinear crystals to double the frequency of laser light.


4.3  Bifurcation,  and  point  attractors 

Interesting new phenomena, such as bifurcation and attractors, occur when the non-linearity becomes large.
In chapter 3 it was shown that the state-space diagram $(x, \dot{x})$ for an undamped harmonic oscillator is an
ellipse with dimensions defined by the total energy of the system. As shown in figure 3.5, for the damped
harmonic oscillator, the state-space diagram spirals inwards to the origin due to dissipation of energy.
Non-linearity distorts the shape of the ellipse or spiral on the state-space diagram, and thus the state-space, or
corresponding phase-space, diagrams provide useful and frequently used representations of the motion of
linear and non-linear periodic systems.

The  complicated  motion  of  non-linear  systems  makes  it  necessary  to  distinguish  between  transient  and 
asymptotic  behavior.  The  damped  harmonic  oscillator  executes  a transient  spiral  motion  that  asymptotically 
approaches  the  origin.  The  transient  behavior  depends  on  the  initial  conditions,  whereas  the  asymptotic  limit 
of  the  steady-state  solution  is  a specific  location,  that  is  called  a point  attractor.  The  point  attractor  for 
damped  motion  in  the  anharmonic  potential  well 

$$U(x) = 2x^2 + x^4 \tag{4.7}$$

is  at  the  minimum,  which  is  the  origin  of  the  state-space  diagram  as  shown  in  figure  4.1. 

The  more  complicated  one-dimensional  potential  well 

$$U(x) = 8 - 4x^2 + 0.5x^4 \tag{4.8}$$

shown  in  figure  4.2,  has  two  minima  that  are  symmetric  about  x = 0 with  a saddle  of  height  8. 

The  kinetic  plus  potential  energies  of  a particle  with  mass  m = 2,  released  in  this  potential,  will  be 
assumed  to  be  given  by 

$$E(x, \dot{x}) = \dot{x}^2 + U(x) \tag{4.9}$$

The state-space plot in figure 4.2 shows contours of constant energy with the minima at $(x, \dot{x}) = (\pm 2, 0)$.
At slightly higher total energy the contours are closed loops around either of the two minima at $x = \pm 2$.
For total energies above the saddle energy of 8, the contours are peanut-shaped and are symmetric about
the origin. Assuming that the motion is weakly damped, a particle released with total energy $E_{total}$
that is higher than $E_{saddle}$ will follow a peanut-shaped spiral trajectory centered at $(x, \dot{x}) = (0, 0)$ in the
state-space diagram for $E_{total} > E_{saddle}$. For $E_{total} < E_{saddle}$ there are two separate solutions for the two
minima centered at $x = \pm 2$ and $\dot{x} = 0$. This is an example of bifurcation where the one solution for
$E_{total} > E_{saddle}$ bifurcates into either of the two solutions for $E_{total} < E_{saddle}$.

Figure 4.2: The left side shows the potential energy for a bimodal symmetric potential $U(x) = 8 - 4x^2 + 0.5x^4$.
The right-hand figure shows contours of the sum of kinetic and potential energies on a state-space
diagram. For total energies above the saddle point the particle follows peanut-shaped trajectories in
state-space centered around $(x, \dot{x}) = (0, 0)$. For total energies below the saddle point the particle will have closed
trajectories about either of the two symmetric minima located at $(x, \dot{x}) = (\pm 2, 0)$. Thus the system solution
bifurcates when the total energy is below the saddle point.

For an initial total energy $E_{total} > E_{saddle}$, damping will result in spiral trajectories of the particle that
will be trapped in one of the two minima. For $E_{total} > E_{saddle}$ the particle trajectories are centered, giving
the impression that they will terminate at $(x, \dot{x}) = (0, 0)$ when the kinetic energy is dissipated. However, for
$E_{total} < E_{saddle}$ the particle will be trapped in one of the two minima and the trajectory will terminate
at the bottom of that potential energy minimum occurring at $(x, \dot{x}) = (\pm 2, 0)$. These two possible terminal
points of the trajectory are called point attractors. This example appears to have a single attractor for
$E_{total} > E_{saddle}$ which bifurcates, leading to two attractors at $(x, \dot{x}) = (\pm 2, 0)$ for $E_{total} < E_{saddle}$. The
determination as to which minimum traps a given particle depends on exactly where the particle starts in
state space, on the damping, etc. That is, for this case, where there is symmetry about the $x$-axis, when
the particle has an initial total energy $E_{total} > E_{saddle}$, the initial conditions within $\pi$ radians of state
space will lead to trajectories that are trapped in the left minimum, and those in the other $\pi$ radians of state space
will be trapped in the right minimum. Trajectories starting near the split between these two halves of the
starting state space will be sensitive to the exact starting phase. This is an example of sensitivity to initial
conditions.
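
The trapping of a damped particle in one of the two point attractors can be illustrated numerically. The following is a minimal sketch, not the method used to produce figure 4.2: it assumes a linear damping force $-b\dot{x}$ with $b = 0.5$ (the text only says the motion is weakly damped), uses $m = 2$ so that the energy is given by equation 4.9, and integrates with a hand-written fourth-order Runge-Kutta scheme. The function names and starting values are hypothetical.

```python
def energy(x, v):
    """Total energy of equation 4.9 for m = 2: E = v^2 + U(x)."""
    return v ** 2 + 8.0 - 4.0 * x ** 2 + 0.5 * x ** 4

def accel(x, v, b, m):
    """Acceleration from m x'' = -dU/dx - b x' with U(x) = 8 - 4x^2 + 0.5x^4."""
    return (8.0 * x - 2.0 * x ** 3 - b * v) / m

def settle(x0, v0, b=0.5, m=2.0, dt=0.01, t_end=200.0):
    """Integrate with RK4 and return the final state (x, v)."""
    x, v = x0, v0
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = v, accel(x, v, b, m)
        k2x = v + 0.5 * dt * k1v
        k2v = accel(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v, b, m)
        k3x = v + 0.5 * dt * k2v
        k3v = accel(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v, b, m)
        k4x = v + dt * k3v
        k4v = accel(x + dt * k3x, v + dt * k3v, b, m)
        x += dt / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
        v += dt / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
    return x, v

if __name__ == "__main__":
    # Both starts have E = 24, well above the saddle energy of 8.
    for v0 in (4.0, -4.0):
        x_end, v_end = settle(0.0, v0)
        print("start (0, %+.1f), E = %.1f -> ends near x = %+.4f"
              % (v0, energy(0.0, v0), x_end))
```

Because the equation of motion is unchanged under $(x, \dot{x}) \rightarrow (-x, -\dot{x})$, mirror-image starting points end in opposite wells, which is the symmetry between the two halves of state space described above.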


4.4  Limit  cycles 

4.4.1  Poincaré-Bendixson  theorem

Coupled  first-order  differential  equations  in  two  dimensions  of  the  form 

$$\dot{x} = f(x, y) \qquad \dot{y} = g(x, y) \tag{4.10}$$

occur  frequently  in  physics.  The  state-space  paths  do  not  cross  for  such  two-dimensional  autonomous  systems, 
where  an  autonomous  system  is  not  explicitly  dependent  on  time. 

The Poincaré-Bendixson theorem states that the state-space, and phase-space, paths of such systems can have
three possible behaviors:

(1) closed paths, like the elliptical paths for the undamped harmonic oscillator,

(2) terminate at an equilibrium point as $t \rightarrow \infty$, like the point attractor for a damped harmonic oscillator,

(3) tend to a limit cycle as $t \rightarrow \infty$.

The limit cycle is unusual in that the periodic motion tends
asymptotically to the limit-cycle attractor independent of whether the initial values are inside or outside
the limit cycle. The balance of dissipative forces and driving forces often leads to limit-cycle attractors,
especially in biological applications. Identification of limit-cycle attractors, as well as the trajectories of the
motion towards these limit-cycle attractors, is more complicated than for point attractors.

Figure 4.3: The Poincaré-Bendixson theorem allows the following three scenarios for two-dimensional
autonomous systems. (1) Closed paths, as illustrated by the undamped harmonic oscillator. (2) Terminate at
an equilibrium point as $t \rightarrow \infty$, as illustrated by the damped harmonic oscillator. (3) Tend to a limit
cycle as $t \rightarrow \infty$, as illustrated by the van der Pol oscillator. [Panels: closed path, point attractor, limit cycle.]

4.4.2  van  der  Pol  damped  harmonic  oscillator: 

The van der Pol damped harmonic oscillator illustrates a non-linear equation that leads to a well-studied
limit-cycle attractor that has important applications in diverse fields. It has an equation of motion given
by

$$\frac{d^2x}{dt^2} + \mu\left(x^2 - 1\right)\frac{dx}{dt} + \omega_0^2 x = 0 \tag{4.11}$$

The non-linear $\mu\left(x^2 - 1\right)\frac{dx}{dt}$ damping term is unusual in that the sign changes when $|x| = 1$, leading to
positive damping for $|x| > 1$ and negative damping for $|x| < 1$. To simplify equation 4.11, assume that the term
$\omega_0^2 x = x$, that is, $\omega_0^2 = 1$.

This equation was studied extensively during the 1920s and 1930s by the Dutch engineer, Balthazar van
der Pol, to describe electronic circuits that incorporate feedback. The form of the solution can be simplified
by defining a variable $y \equiv \frac{dx}{dt}$. Then the second-order equation 4.11 can be expressed as two coupled first-order
equations.

$$y \equiv \frac{dx}{dt} \tag{4.12}$$

$$\frac{dy}{dt} = -x - \mu\left(x^2 - 1\right)y \tag{4.13}$$

It is advantageous to transform the $(x, \dot{x})$ state space to polar coordinates by setting

$$x = r\cos\theta \qquad y = r\sin\theta \tag{4.14}$$


Since $r^2 = x^2 + y^2$, it follows that

$$r\frac{dr}{dt} = x\frac{dx}{dt} + y\frac{dy}{dt} \tag{4.15}$$

Similarly, for the angle coordinate,

$$\frac{dx}{dt} = \frac{dr}{dt}\cos\theta - r\frac{d\theta}{dt}\sin\theta \tag{4.16}$$

$$\frac{dy}{dt} = \frac{dr}{dt}\sin\theta + r\frac{d\theta}{dt}\cos\theta \tag{4.17}$$

Multiplying equation 4.16 by $y$ and 4.17 by $x$ and subtracting gives

$$r^2\frac{d\theta}{dt} = x\frac{dy}{dt} - y\frac{dx}{dt} \tag{4.18}$$

Equations 4.15 and 4.18 allow the van der Pol equations of motion to be written in polar coordinates

$$\frac{dr}{dt} = -\mu\left(r^2\cos^2\theta - 1\right)r\sin^2\theta \tag{4.19}$$

$$\frac{d\theta}{dt} = -1 - \mu\left(r^2\cos^2\theta - 1\right)\sin\theta\cos\theta \tag{4.20}$$

The non-linear terms on the right-hand side of equations 4.19 and 4.20 have a complicated form.

Figure 4.4: Solutions of the van der Pol system for $\mu = 0.2$ (top row) and $\mu = 5$ (bottom row), assuming that
$\omega_0^2 = 1$. The left column shows the time dependence $x(t)$. The right column shows the corresponding $(x, \dot{x})$
state-space plots. Upper: Weak nonlinearity, $\mu = 0.2$; at large times the solution tends to one limit
cycle for initial values inside or outside the limit-cycle attractor. The amplitude $x(t)$ for two initial
conditions approaches an approximately harmonic oscillation. Lower: Strong nonlinearity, $\mu = 5$; solutions
approach a common limit-cycle attractor for initial values inside or outside the limit-cycle attractor while
the amplitude $x(t)$ approaches a common approximate square-wave oscillation.




Weak non-linearity: $\mu \ll 1$

In the limit that $\mu \rightarrow 0$, equations 4.19 and 4.20 correspond to a circular state-space trajectory similar to
the harmonic oscillator. That is, the solution is of the form

$$x(t) = \rho\sin(t - t_0) \tag{4.21}$$

where $\rho$ and $t_0$ are arbitrary parameters. For weak non-linearity, $\mu \ll 1$, the angular equation 4.20 has a
rotational frequency that is close to unity, since the $\sin\theta\cos\theta$ term alternates in sign during each rotation and
is multiplied by the small value of $\mu$. For $\mu \ll 1$ and $r < 1$, the $\left(r^2\cos^2\theta - 1\right)$ term in the radial equation 4.19
is negative, so $\frac{dr}{dt}$ is positive and the radius increases monotonically. For $r > 1$ the sign of this bracket
alternates during each rotation; averaged over a rotation, the radius continues to grow for $r < 2$ and decreases
for $r > 2$. Thus, for very weak non-linearity, this radial behavior
results in the amplitude spiralling to a well-defined limit-cycle attractor value of $\rho = 2$, as illustrated by
the state-space plots in figure 4.4 for cases where the initial condition is inside or external to the circular
attractor. The motions for different initial conditions thus approach the same asymptotic behavior.
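
The limit-cycle radius of 2 can be made plausible by averaging the radial equation over one rotation. The following is a sketch under the assumption that, for $\mu \ll 1$, the radius $r$ changes negligibly during a single rotation while $\theta$ advances at an essentially uniform rate, so that angle averages can be taken at fixed $r$.

```latex
\begin{align*}
\left\langle \sin^2\theta \right\rangle &= \frac{1}{2}, \qquad
\left\langle \sin^2\theta\cos^2\theta \right\rangle
   = \left\langle \tfrac{1}{4}\sin^2 2\theta \right\rangle = \frac{1}{8}, \\
\left\langle \frac{dr}{dt} \right\rangle
  &= -\mu r\left( r^2 \left\langle \sin^2\theta\cos^2\theta \right\rangle
                  - \left\langle \sin^2\theta \right\rangle \right)
   = -\mu r\left( \frac{r^2}{8} - \frac{1}{2} \right)
   = \frac{\mu r}{8}\left( 4 - r^2 \right).
\end{align*}
```

Under this assumption the averaged radial velocity is positive for $0 < r < 2$, negative for $r > 2$, and zero at $r = 2$, so trajectories starting either inside or outside drift toward $r = 2$.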


Dominant non-linearity: $\mu \gg 1$


For the case where the non-linearity is dominant, that is $\mu \gg 1$, then as shown in figure 4.4, the system
approaches a well-defined attractor, but in this case it has a significantly skewed shape in state-space, while
the amplitude approximates a square wave. The solution remains close to $x = +2$ until $y = \dot{x} \approx +7$ and
then it relaxes quickly to $x = -2$ with $y = \dot{x} \approx 0$. This is followed by the mirror image. This behavior is
called a relaxation oscillation in that a tension builds up slowly then dissipates by a sudden relaxation process.
The seesaw is an extreme example of a relaxation oscillator where the angle switches spontaneously from
one solution to the other when the difference in the moment arms changes sign.
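
The approach to a common limit cycle from inside and outside can be checked with a short numerical experiment. This is a minimal sketch rather than the calculation behind figure 4.4: it integrates the coupled first-order form of the van der Pol equation with $\omega_0 = 1$ using a hand-written fourth-order Runge-Kutta step, and the starting points, step size and measurement window are arbitrary choices made here.

```python
def vdp_amplitude(mu, x0, y0, dt=0.005, t_end=100.0, t_measure=80.0):
    """Integrate dx/dt = y, dy/dt = -x - mu (x^2 - 1) y with RK4.

    Returns the largest |x| seen after t_measure, as an estimate of the
    amplitude of the limit cycle the trajectory has settled onto.
    """
    def deriv(x, y):
        return y, -x - mu * (x * x - 1.0) * y

    x, y, amp = x0, y0, 0.0
    n_steps = int(round(t_end / dt))
    for i in range(n_steps):
        k1x, k1y = deriv(x, y)
        k2x, k2y = deriv(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = deriv(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = deriv(x + dt * k3x, y + dt * k3y)
        x += dt / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
        y += dt / 6.0 * (k1y + 2.0 * k2y + 2.0 * k3y + k4y)
        if (i + 1) * dt >= t_measure:
            amp = max(amp, abs(x))
    return amp

if __name__ == "__main__":
    for mu in (0.2, 5.0):
        inside = vdp_amplitude(mu, 0.1, 0.0)    # starts inside the limit cycle
        outside = vdp_amplitude(mu, 4.0, 0.0)   # starts outside the limit cycle
        print("mu = %.1f: amplitude from inside = %.3f, from outside = %.3f"
              % (mu, inside, outside))
```

Both starting points should report essentially the same amplitude, close to 2, for weak as well as strong non-linearity; only the shape of the waveform differs between the two cases.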

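The study of feedback in electronic circuits was the stimulus for study of this equation by van der
Pol. However, Lord Rayleigh first identified such relaxation oscillator behavior in 1880 during studies of
vibrations of a stringed instrument excited by a bow, or the squeaking of a brake drum. In his discussion of
non-linear effects in acoustics, he derived the equation

$$\ddot{x} - \left(a - b\dot{x}^2\right)\dot{x} + \omega_0^2 x = 0 \tag{4.22}$$

Differentiation of Rayleigh's equation 4.22 gives

$$\dddot{x} - \left(a - 3b\dot{x}^2\right)\ddot{x} + \omega_0^2\dot{x} = 0 \tag{4.23}$$

Using the substitution of

$$y = y_0\sqrt{\frac{3b}{a}}\,\dot{x} \tag{4.24}$$

leads to the relations

$$\dot{x} = \sqrt{\frac{a}{3b}}\frac{y}{y_0} \qquad \ddot{x} = \sqrt{\frac{a}{3b}}\frac{\dot{y}}{y_0} \qquad \dddot{x} = \sqrt{\frac{a}{3b}}\frac{\ddot{y}}{y_0} \tag{4.25}$$

Substituting these relations into equation 4.23 gives

$$\sqrt{\frac{a}{3b}}\frac{\ddot{y}}{y_0} - \left(a - \frac{3ba}{3b}\frac{y^2}{y_0^2}\right)\sqrt{\frac{a}{3b}}\frac{\dot{y}}{y_0} + \omega_0^2\sqrt{\frac{a}{3b}}\frac{y}{y_0} = 0 \tag{4.26}$$

Multiplying by $y_0\sqrt{\frac{3b}{a}}$ and rearranging leads to the van der Pol equation

$$\ddot{y} - \frac{a}{y_0^2}\left(y_0^2 - y^2\right)\dot{y} + \omega_0^2 y = 0 \tag{4.27}$$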

The  rhythm  of  a heartbeat  driven  by  a pacemaker  is  an  important  application  where  the  self-stabilization  of 
the  attractor  is  a desirable  characteristic  to  stabilize  an  irregular  heartbeat;  the  medical  term  is  arrhythmia. 
The  mechanism  that  leads  to  synchronization  of  the  many  pacemaker  cells  in  the  heart  and  human  body  to 
an  implanted  pacemaker  is  discussed  in  chapter  12.12.  Another  biological  application  of  limit  cycles  is  the 
time  variation  of  animal  populations. 

In summary, the non-linear damping of the van der Pol oscillator leads to a self-stabilized, single limit-cycle
attractor that is insensitive to the initial conditions. The van der Pol oscillator has many important
applications such as bowed musical instruments, electrical circuits, and human anatomy as mentioned above.


4.5  Harmonically-driven,  linearly-damped,  plane  pendulum


The  harmonically-driven,  linearly-damped,  plane  pendulum  illustrates  many  of  the  phenomena  exhibited  by 
non-linear  systems  as  they  evolve  from  ordered  to  chaotic  motion.  It  illustrates  the  remarkable  fact  that 
determinism  does  not  imply  either  regular  behavior  or  predictability.  The  well-known,  harmonically-driven 
linearly-damped pendulum provides an ideal basis for an introduction to non-linear dynamics¹.

Consider a harmonically-driven linearly-damped plane pendulum of moment of inertia $I$ and mass $m$ in
a gravitational field that is driven by a torque due to a force $F(t) = F_D\cos\omega t$ at a moment arm $L$. The
damping term is $b$ and the angular displacement of the pendulum, relative to the vertical, is $\theta$. The equation
of motion of the harmonically-driven linearly-damped simple pendulum can be written as

$$I\ddot\theta + b\dot\theta + mgL\sin\theta = LF_D\cos\omega t \tag{4.28}$$


Note that the sinusoidal restoring force for the plane pendulum is non-linear for large angles $\theta$. The natural
angular frequency of the free pendulum is

$$\omega_0 = \sqrt{\frac{mgL}{I}} \tag{4.29}$$

A dimensionless parameter $\gamma$, which is called the drive strength, is defined by

$$\gamma \equiv \frac{F_D}{mg} \tag{4.30}$$


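The equation of motion 4.28 can be generalized by use of dimensionless units for both time $\tilde{t}$ and relative
drive frequency $\tilde\omega$ defined by

$$\tilde{t} \equiv \omega_0 t \qquad \tilde\omega \equiv \frac{\omega}{\omega_0} \tag{4.31}$$

In addition, define the inverse damping factor $Q$ as

$$Q \equiv \frac{\omega_0 I}{b} \tag{4.32}$$

These definitions allow equation 4.28 to be written in the dimensionless form

$$\frac{d^2\theta}{d\tilde{t}^2} + \frac{1}{Q}\frac{d\theta}{d\tilde{t}} + \sin\theta = \gamma\cos\tilde\omega\tilde{t} \tag{4.33}$$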
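
The reduction of equation 4.28 to the dimensionless form can be followed step by step. The sketch below assumes the definitions $\omega_0^2 = mgL/I$, $\gamma = F_D/mg$, $\tilde{t} = \omega_0 t$, $\tilde\omega = \omega/\omega_0$ and $Q = \omega_0 I/b$ as read from the text; since the printed definitions are partly illegible in this copy, they should be checked against the original.

```latex
\begin{align*}
&\text{Divide (4.28) by } mgL = I\omega_0^2: &
\frac{1}{\omega_0^2}\frac{d^2\theta}{dt^2}
   + \frac{b}{I\omega_0^2}\frac{d\theta}{dt} + \sin\theta
   &= \frac{F_D}{mg}\cos\omega t , \\
&\text{Use } \frac{d}{dt} = \omega_0\frac{d}{d\tilde{t}}: &
\frac{d^2\theta}{d\tilde{t}^2}
   + \frac{b}{I\omega_0}\frac{d\theta}{d\tilde{t}} + \sin\theta
   &= \gamma\cos\!\left(\frac{\omega}{\omega_0}\,\omega_0 t\right) , \\
&\text{Insert } Q \text{ and } \tilde\omega: &
\frac{d^2\theta}{d\tilde{t}^2}
   + \frac{1}{Q}\frac{d\theta}{d\tilde{t}} + \sin\theta
   &= \gamma\cos\tilde\omega\tilde{t} .
\end{align*}
```

In this form the motion depends only on the three dimensionless parameters $\gamma$, $Q$ and $\tilde\omega$, together with the initial conditions.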


The behavior of the angle $\theta$ for the driven damped plane pendulum depends on the drive strength $\gamma$
and the damping factor $Q$. This driven damped plane pendulum is evaluated assuming that the damping
coefficient $Q = 2$, and that the relative angular frequency $\tilde\omega = \frac{2}{3}$, which is close to resonance where chaotic
phenomena are manifest. The Runge-Kutta method is used to solve this non-linear equation of motion.
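
A Runge-Kutta integration of the dimensionless pendulum equation 4.33 can be sketched as follows. This is an illustrative implementation, not the code used for the figures in this chapter: it uses the classical fourth-order scheme with a fixed step, takes $Q = 2$ and $\tilde\omega = 2/3$ as defaults, and compares the steady-state amplitude for a weak drive with the amplitude $\gamma/\sqrt{(1-\tilde\omega^2)^2 + (\tilde\omega/Q)^2}$ expected for the corresponding linear oscillator. All function names are hypothetical.

```python
import math

def rhs(t, theta, omega, gamma, Q, w):
    """Equation 4.33 as two first-order equations: returns (dtheta/dt, domega/dt)."""
    return omega, -omega / Q - math.sin(theta) + gamma * math.cos(w * t)

def integrate(gamma, Q=2.0, w=2.0 / 3.0, theta=0.0, omega=0.0,
              periods=20, steps_per_period=1000):
    """RK4 integration over a number of drive periods; returns lists (t, theta, omega)."""
    h = 2.0 * math.pi / w / steps_per_period
    ts, thetas, omegas = [0.0], [theta], [omega]
    for i in range(periods * steps_per_period):
        t = i * h
        k1 = rhs(t, theta, omega, gamma, Q, w)
        k2 = rhs(t + h / 2, theta + h / 2 * k1[0], omega + h / 2 * k1[1], gamma, Q, w)
        k3 = rhs(t + h / 2, theta + h / 2 * k2[0], omega + h / 2 * k2[1], gamma, Q, w)
        k4 = rhs(t + h, theta + h * k3[0], omega + h * k3[1], gamma, Q, w)
        theta += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        omega += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        ts.append(t + h)
        thetas.append(theta)
        omegas.append(omega)
    return ts, thetas, omegas

def steady_amplitude(gamma, last_periods=5, steps_per_period=1000):
    """Largest |theta| over the final drive periods, after transients have decayed."""
    _, thetas, _ = integrate(gamma, steps_per_period=steps_per_period)
    tail = thetas[-last_periods * steps_per_period:]
    return max(abs(th) for th in tail)

def linear_amplitude(gamma, Q=2.0, w=2.0 / 3.0):
    """Steady-state amplitude of the linearized oscillator (sin(theta) -> theta)."""
    return gamma / math.sqrt((1.0 - w ** 2) ** 2 + (w / Q) ** 2)

if __name__ == "__main__":
    print("numerical amplitude, gamma = 0.2:", steady_amplitude(0.2))
    print("linear-oscillator amplitude     :", linear_amplitude(0.2))
```

For a weak drive the two amplitudes should agree to within a few percent, with the small difference arising from the neglected higher-order terms of the sine expansion.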


4.5.1  Close  to  linearity 

For drive strength $\gamma = 0.2$ the amplitude is sufficiently small that $\sin\theta \simeq \theta$, superposition applies, and the
solution is identical to that for the driven linearly-damped linear oscillator. As shown in figure 4.5, once
the transient solution dies away, the steady-state solution asymptotically approaches one attractor that has
an amplitude of $\pm 0.3$ radians and a phase shift $\delta$ with respect to the driving force. The abscissa is given
in units of the dimensionless time $\tilde{t} = \omega_0 t$. The transient solution depends on the initial conditions and
dies away after about 5 periods, whereas the steady-state solution is independent of the initial conditions
and has a state-space diagram that has an elliptical shape, characteristic of the harmonic oscillator. For all
initial conditions, the time dependence and state-space diagram for steady-state motion approach a unique
solution, called an "attractor"; that is, the pendulum oscillates sinusoidally with a given amplitude at the
frequency of the driving force and with a constant phase shift $\delta$, i.e.

$$\theta(t) = A\cos(\omega t - \delta). \tag{4.34}$$

This solution is identical to that for the harmonically-driven, linearly-damped, linear oscillator discussed in
chapter 3.6.


¹ A similar approach is used by the book "Chaotic Dynamics" by Baker and Gollub [Bak96].

Figure 4.5: Motion of the driven damped pendulum for drive strengths of $\gamma = 0.2$, $\gamma = 0.9$, $\gamma = 1.05$, and
$\gamma = 1.078$. The left side shows the time dependence of the deflection angle $\theta$ with the time axis expressed
in dimensionless units $\tilde{t}$. The right side shows the corresponding state-space plots. These plots assume
$\tilde\omega = \frac{\omega}{\omega_0} = \frac{2}{3}$, $Q = 2$, and the motion starts with $\theta = \omega = 0$.

Figure 4.6: The driven damped pendulum assuming that $\tilde\omega = \frac{2}{3}$, $Q = 2$, with initial conditions $\theta(0) = -\frac{\pi}{2}$,
$\omega(0) = 0$. The system exhibits period-two motion for drive strengths of $\gamma = 1.078$ as shown by the state-space
diagram for cycles 10 to 20. For $\gamma = 1.081$ the system exhibits period-four motion shown for cycles
10 to 30.


4.5.2  Weak  nonlinearity 


Figure 4.5 shows that for drive strength $\gamma = 0.9$, after the transient solution dies away, the steady-state
solution settles down to one attractor that oscillates at the drive frequency with an amplitude of slightly
more than $\frac{\pi}{2}$ radians, for which the small-angle approximation fails. The distortion due to the non-linearity
is exhibited by the non-elliptical shape of the state-space diagram.

The observed behavior can be calculated using the successive approximation method discussed in chapter
4.2. That is, close to small angles the sine function can be approximated by replacing

$$\sin\theta \approx \theta - \frac{\theta^3}{6}$$

in equation 4.33 to give

$$\ddot\theta + \frac{1}{Q}\dot\theta + \theta - \frac{\theta^3}{6} = \gamma\cos\tilde\omega\tilde{t} \tag{4.35}$$

where the dots denote derivatives with respect to the dimensionless time $\tilde{t}$. As a first approximation assume that

$$\theta(\tilde{t}) \approx A\cos(\tilde\omega\tilde{t} - \delta)$$

then the small $\theta^3$ term in equation 4.35 contributes a term proportional to $\cos^3(\tilde\omega\tilde{t} - \delta)$. But

$$\cos^3(\tilde\omega\tilde{t} - \delta) = \frac{1}{4}\left(\cos 3(\tilde\omega\tilde{t} - \delta) + 3\cos(\tilde\omega\tilde{t} - \delta)\right)$$

That is, the nonlinearity introduces a small term proportional to $\cos 3(\tilde\omega\tilde{t} - \delta)$. Since the right-hand side of
equation 4.35 is a function of only $\cos\tilde\omega\tilde{t}$, the terms in $\theta$, $\dot\theta$, and $\ddot\theta$ on the left-hand side must contain
the third-harmonic $\cos 3(\tilde\omega\tilde{t} - \delta)$ term. Thus a better approximation to the solution is of the form

$$\theta(\tilde{t}) = A\left[\cos(\tilde\omega\tilde{t} - \delta) + \varepsilon\cos 3(\tilde\omega\tilde{t} - \delta)\right] \tag{4.36}$$

where the admixture coefficient $\varepsilon < 1$. This successive approximation method can be repeated to add
additional terms proportional to $\cos n(\tilde\omega\tilde{t} - \delta)$ where $n$ is an integer with $n > 3$. Thus the nonlinearity
introduces progressively weaker $n$-fold harmonics to the solution. This successive approximation approach
is viable only when the admixture coefficient $\varepsilon < 1$. Note that these harmonics are integer multiples of $\tilde\omega$,
thus the steady-state response is identical for each full period even though the state-space contours deviate
from an elliptical shape.


4.5.3  Onset  of  complication 

Figure 4.5 shows that for $\gamma = 1.05$ the drive strength is sufficiently strong to cause the transient solution for
the pendulum to rotate through two complete cycles before settling down to a single steady-state attractor
solution at the drive frequency. However, this attractor solution is shifted two complete rotations relative
to the initial condition. The state-space diagram clearly shows the rolling motion of the transient solution
for the first two periods prior to the system settling down to a single steady-state attractor. The successive
approximation approach completely fails at this coupling strength since $\theta$ oscillates through large values that
are multiples of $\pi$.

Figure 4.5 shows that for drive strength $\gamma = 1.078$ the motion evolves to a much more complicated
periodic motion with a period that is three times the period of the driving force. Moreover the amplitude
exceeds $2\pi$, corresponding to the pendulum oscillating over top dead center with the centroid of the motion
offset by $3\pi$ from the initial condition. Both the state-space diagram and the time dependence of the motion
illustrate the complexity of this motion, which depends sensitively on the magnitude of the drive strength $\gamma$,
in addition to the initial conditions $(\theta(0), \omega(0))$ and damping factor $Q$, as is shown in figure 4.6.


4.5.4  Period  doubling  and  bifurcation 

For drive strength $\gamma = 1.078$, with the initial condition $(\theta(0), \omega(0)) = (0, 0)$, the system exhibits a regular
motion with a period that is three times the drive period. In contrast, if the initial condition is
$[\theta(0) = -\frac{\pi}{2}, \omega(0) = 0]$ then, as shown in figure 4.6, the steady-state solution has the drive frequency with no offset
in $\theta$, that is, it exhibits period-one oscillation. This appearance of two separate and very different attractors
for $\gamma = 1.078$, using different initial conditions, is called bifurcation.

An additional feature of the system response for $\gamma = 1.078$ is that changing the initial conditions to
$[\theta(0) = -\frac{\pi}{2}, \omega(0) = 0]$ shows that the even and odd periods of oscillation differ slightly
in shape and amplitude, that is, the system really has period-two oscillation. This period-two motion, i.e.
period doubling, is clearly illustrated by the state-space diagram in that, although the motion still is
dominated by period-one oscillations, the even and odd cycles are slightly displaced. Thus, for different
initial conditions, the system for $\gamma = 1.078$ bifurcates into either of two attractors that have very different
waveforms, one of which exhibits period doubling.

The period doubling exhibited for $\gamma = 1.078$ is followed by a second period doubling when $\gamma = 1.081$, as
shown in figure 4.6. With increase in drive strength this period doubling continues in binary multiples
to period 8, 16, 32, 64, etc. Numerically it is found that the threshold for period doubling is $\gamma_1 = 1.0663$,
the doubling from period two to period four occurs at $\gamma_2 = 1.0793$, etc. Feigenbaum showed that the intervals
between successive thresholds in this cascade shrink with increase in drive strength according to the relation

$$(\gamma_{n+1} - \gamma_n) \simeq \frac{1}{\delta}(\gamma_n - \gamma_{n-1}) \tag{4.37}$$

where $\delta = 4.6692016$ is called a Feigenbaum number. As $n \rightarrow \infty$ this cascading sequence goes to a limit
$\gamma_c$ where

$$\gamma_c = 1.0829 \tag{4.38}$$

4.5.5  Rolling  motion 

It was shown that for $\gamma > 1.05$ the transient solution causes the pendulum to have angle excursions exceeding
$2\pi$, that is, the system rolls over top dead center. For drive strengths in the range $1.3 < \gamma < 1.4$, the
steady-state solution for the system undergoes continuous rolling motion as illustrated in figure 4.7. The time
dependence for the angle exhibits a periodic oscillatory motion superimposed upon a monotonic rolling
motion, whereas the time dependence of the angular velocity $\omega = \frac{d\theta}{dt}$ is periodic. The state-space plot
for rolling motion corresponds to a chain of loops with a spacing of $2\pi$ between each loop. The state-space
diagram for rolling motion is more compactly presented if the origin is shifted by $2\pi$ per revolution to keep
the plot within bounds, as illustrated in figure 4.7c.

Figure 4.7: Rolling motion for the driven damped plane pendulum for $\gamma = 1.4$. (a) The time dependence
of angle $\theta(t)$ increases by $2\pi$ per drive period whereas (b) the angular velocity $\omega(t)$ exhibits periodicity. (c)
The state-space plot for rolling motion is shown with the origin shifted by $2\pi$ per revolution to keep the plot
within the bounds $-\pi < \theta < +\pi$.

4.5.6  Onset  of  chaos 

When the drive strength is increased to $\gamma = 1.105$, the system does not approach a unique attractor,
as illustrated by figure 4.8 left which shows state space orbits for cycles 25 to 200. Note that these orbits do
not repeat, implying the onset of chaos. For drive strengths greater than $\gamma_c = 1.0829$ the driven damped
plane pendulum starts to exhibit chaotic behavior. The onset of chaotic motion is illustrated by making a
3-dimensional plot which combines the time coordinate with the state-space coordinates as illustrated in
figure 4.8 right. This plot shows 16 trajectories starting at different initial values in the range
$-0.15 < \theta < 0.15$ for $\gamma = 1.168$. Some solutions are erratic in that, while trying to oscillate at the drive
frequency, they never settle down to a steady periodic motion, which is characteristic of chaotic motion.
Figure 4.8 right illustrates the considerable sensitivity of the motion to the initial conditions. That is, this
deterministic system can exhibit either order, or chaos, dependent on minuscule differences in initial conditions.




Figure 4.8: Left: State-space orbits for the driven damped pendulum with $\gamma = 1.105$. Note that the orbits
do not repeat for cycles 25 to 200. Right: Time-state-space diagram for $\gamma = 1.168$. The plot shows 16
trajectories starting with different initial values in the range $-0.15 < \theta < 0.15$.

Figure 4.9: State-space plots for the harmonically-driven, linearly-damped, pendulum for driving amplitudes
of $F_D = 0.5$ and $F_D = 1.2$. These calculations were performed using the Runge-Kutta method by E. Shah
(private communication).


4.6  Differentiation  between  ordered  and  chaotic  motion 

Chapter 4.5 showed that motion in non-linear systems can exhibit both order and chaos. The transition
between ordered motion and chaotic motion depends sensitively on both the initial conditions and the model
parameters. It is surprisingly difficult to unambiguously distinguish between complicated ordered motion
and chaotic motion. Moreover, the motion can fluctuate between order and chaos in an erratic manner
depending on the initial conditions. The extreme sensitivity to initial conditions of the motion for
non-linear systems makes it essential to have quantitative measures that can characterize the degree of order,
and interpret the complicated dynamical motion of systems. As an illustration, consider the harmonically-driven,
linearly-damped, pendulum with $Q = 2$, and driving force $F(t) = F_D \sin\tilde{\omega}t$ where $\tilde{\omega} = 2/3$. Figure 4.9
shows the state-space plots for two driving amplitudes, $F_D = 0.5$ which leads to ordered motion, and $F_D = 1.2$
which leads to possible chaotic motion. It can be seen that for $F_D = 0.5$ the state-space diagram converges
to a single attractor once the transient solution has died away. This is in contrast to the case for $F_D = 1.2$,
where the state-space diagram does not converge to a single attractor, but exhibits possible chaotic motion.
Three quantitative measures can be used to differentiate ordered motion from chaotic motion for this system,
namely, the Lyapunov exponent, the bifurcation diagram, and the Poincaré section, as illustrated below.


4.6.1  Lyapunov  exponent 

The Lyapunov exponent provides a quantitative and useful measure of the instability of trajectories, and how
quickly nearby initial conditions diverge. It compares two identical systems that start with an infinitesimally
small difference in the initial conditions in order to ascertain whether they converge to the same attractor
at long times, corresponding to a stable system, or whether they diverge to very different attractors,
characteristic of chaotic motion. If the initial separation between the trajectories in phase space at $t = 0$ is
$|\delta Z_0|$, then to first order the time dependence of the difference can be assumed to depend exponentially on
time. That is,

$$|\delta Z(t)| \sim e^{\lambda t}|\delta Z_0| \tag{4.39}$$

where $\lambda$ is the Lyapunov exponent. That is, the Lyapunov exponent is defined to be


$$\lambda = \lim_{t \to \infty} \lim_{|\delta Z_0| \to 0} \frac{1}{t} \ln \frac{|\delta Z(t)|}{|\delta Z_0|} \tag{4.40}$$


Systems for which the Lyapunov exponent $\lambda < 0$ (negative) converge exponentially to the same attractor
solution at long times since $|\delta Z(t)| \to 0$ for $t \to \infty$. By contrast, systems for which $\lambda > 0$ (positive) diverge
to completely different long-time solutions, that is, $|\delta Z(t)| \to \infty$ for $t \to \infty$. Even for infinitesimally
small differences in the initial conditions, systems having a positive Lyapunov exponent diverge to different
attractors, whereas when the Lyapunov exponent $\lambda < 0$ they correspond to stable solutions.

Figure 4.10: Lyapunov plots of $\Delta\theta$ versus time for two initial starting points differing by $\Delta\theta_0 = 0.001$ rad.
The parameters are $Q = 2$, $F(t) = F_D \sin\tilde{\omega}t$, and $\Delta t = 0.04$ s. The Lyapunov exponent for $F_D = 0.5$,
which is drawn as a dashed line, is convergent with $\lambda = -0.251$. For $F_D = 1.2$ the exponent is divergent, as
indicated by the dashed line which has a slope of $\lambda = 0.1538$. These calculations were performed using the
Runge-Kutta method by E. Shah (private communication).

Figure 4.10 illustrates Lyapunov plots for the harmonically-driven, linearly-damped, plane pendulum,
with the same conditions discussed in chapter 4.5. Note that for the small driving amplitude $F_D = 0.5$,
the Lyapunov plot converges to ordered motion with an exponent $\lambda = -0.251$, whereas for $F_D = 1.2$, the
plot diverges, characteristic of chaotic motion, with an exponent $\lambda = 0.1538$. The Lyapunov exponent usually
fluctuates widely at the local oscillator frequency, and thus the time average of the Lyapunov exponent must
be taken over many periods of the oscillation to identify the general trend with time. Some systems near an
order-to-chaos transition can exhibit positive Lyapunov exponents for short times, characteristic of chaos,
and then converge to negative $\lambda$ at longer time implying ordered motion. The Lyapunov exponents are
used extensively to monitor the stability of the solutions for non-linear systems. For example, the Lyapunov
exponent is used to identify whether fluid flow is laminar or turbulent as discussed in chapter 15.8.
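
The definition in equation 4.40 can be explored numerically. The following is a minimal sketch, not the
calculation used for figure 4.10: it assumes the normalized equation of motion
$\ddot{\theta} = -\dot{\theta}/Q - \sin\theta + F_D \sin\tilde{\omega}t$ with $Q = 2$ and $\tilde{\omega} = 2/3$, uses a hand-written fourth-order
Runge-Kutta step, and estimates the largest exponent by repeatedly rescaling the separation between a
reference trajectory and a nearby one. The function names, initial conditions, and separation `d0` are
illustrative choices.

```python
import math

Q = 2.0              # quality factor quoted in the text
W_DRIVE = 2.0 / 3.0  # normalized drive frequency (assumed value)


def derivs(t, theta, omega, f_d):
    """Assumed equation of motion: theta'' = -theta'/Q - sin(theta) + F_D sin(w t)."""
    return omega, -omega / Q - math.sin(theta) + f_d * math.sin(W_DRIVE * t)


def rk4_step(t, theta, omega, dt, f_d):
    """Advance (theta, omega) by one fourth-order Runge-Kutta step."""
    k1t, k1w = derivs(t, theta, omega, f_d)
    k2t, k2w = derivs(t + dt / 2, theta + dt / 2 * k1t, omega + dt / 2 * k1w, f_d)
    k3t, k3w = derivs(t + dt / 2, theta + dt / 2 * k2t, omega + dt / 2 * k2w, f_d)
    k4t, k4w = derivs(t + dt, theta + dt * k3t, omega + dt * k3w, f_d)
    theta += dt / 6 * (k1t + 2 * k2t + 2 * k3t + k4t)
    omega += dt / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
    return theta, omega


def largest_lyapunov(f_d, dt=0.04, n_steps=25000, d0=1e-7):
    """Average the logarithmic growth rate of a small separation, rescaled every step."""
    t = 0.0
    th, om = 0.2, 0.0        # reference trajectory (illustrative start)
    th2, om2 = th + d0, om   # neighbour displaced by d0 in angle
    log_sum = 0.0
    for _ in range(n_steps):
        th, om = rk4_step(t, th, om, dt, f_d)
        th2, om2 = rk4_step(t, th2, om2, dt, f_d)
        t += dt
        dth, dom = th2 - th, om2 - om
        d = math.hypot(dth, dom)
        log_sum += math.log(d / d0)
        # Rescale the neighbour back to distance d0 along the current separation.
        th2 = th + dth * d0 / d
        om2 = om + dom * d0 / d
    return log_sum / (n_steps * dt)


lam_ordered = largest_lyapunov(0.5)
lam_chaotic = largest_lyapunov(1.2)
print("F_D = 0.5:", lam_ordered)
print("F_D = 1.2:", lam_chaotic)
```

The sign of the two estimates, negative for $F_D = 0.5$ and positive for $F_D = 1.2$, is the feature to compare
with figure 4.10; the numerical values depend on the averaging time and on the assumptions listed above.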

A dynamical system in n-dimensional phase space will have a set of n Lyapunov exponents
$\{\lambda_1, \lambda_2, \ldots, \lambda_n\}$ associated with a set of attractors, the importance of which depend on the initial
conditions. Typically one Lyapunov exponent dominates at one specific location in phase space, and thus it is
usual to use the maximal Lyapunov exponent to identify chaos. The Lyapunov exponent is a very sensitive
measure of the onset of chaos and provides an important test of the chaotic nature for the complicated
motion exhibited by non-linear systems.

4.6.2  Bifurcation  diagram 

The bifurcation diagram simplifies the presentation of the dynamical motion by sampling the status of
the system once per period, synchronized to the driving frequency, for many sets of initial conditions. The
results are presented graphically as a function of one parameter of the system in the bifurcation diagram. For
example, the wildly different behavior in the driven damped plane pendulum is represented on a bifurcation
diagram in figure 4.11, which shows the observed angular velocity $\omega$ of the pendulum sampled once per drive
cycle plotted versus drive strength. The bifurcation diagram is obtained by sampling either the angle $\theta$,
or angular velocity $\omega$, once per drive cycle, that is, it represents the observables of the pendulum using a
stroboscopic technique that samples the motion synchronous with the drive frequency. Bifurcation plots also
can be created as a function of either the time $t$, the damping factor $Q$, the normalized frequency $\tilde{\omega} = \omega/\omega_0$,
or the driving amplitude $\gamma$.

In the domain with drive strength $\gamma < 1.0663$ there is one unique angle each drive cycle as illustrated
by the bifurcation diagram. For slightly higher drive strength, period-two bifurcation behavior results in
two different angles per drive cycle. The Lyapunov exponent is negative for this region corresponding to
ordered motion. The cascade of period doubling with increase in drive strength is readily apparent until chaos
sets in at the critical drive strength $\gamma_c$ when there is a random distribution of sampled angular velocities
and the Lyapunov exponent becomes positive. Note that at $\gamma = 1.0845$ there is a brief interval of period-6
motion followed by another region of chaos. Around $\gamma = 1.1$ there is a region that is primarily chaotic, which
is reflected by chaotic values of the angular velocity on the bifurcation plot and large positive values of the
Lyapunov exponent. The region around $\gamma = 1.12$ exhibits period-three motion and negative Lyapunov
exponent corresponding to ordered motion. The $1.15 < \gamma < 1.25$ region is mainly chaotic and has a large
positive Lyapunov exponent. The region with $1.3 < \gamma < 1.4$ is striking in that this corresponds to rolling
motion with reemergence of period one and negative Lyapunov exponent. This period-one motion is due to a
continuous rolling motion of the plane pendulum as shown in figure 4.7 where it is seen that the average $\theta$
increases $2\pi$ per cycle, whereas the angular velocity $\omega$ exhibits a periodic motion. That is, on average the
pendulum is rotating $2\pi$ per cycle. Above $\gamma = 1.4$ the system starts to exhibit period doubling followed by
chaos reminiscent of the behavior seen at lower $\gamma$ values.

Figure 4.11: The bifurcation diagram samples the angular velocity $\omega$ once per period for the driven,
linearly-damped, plane pendulum, plotted as a function of the drive strength $\gamma$. Regions of period doubling
and chaos, as well as islands of stability, are all manifest as the drive strength $\gamma$ is changed. Note that
the limited number of samples causes broadening of the lines adjacent to bifurcations.

These  results  show  that  the  bifurcation  diagram  nicely  illustrates  the  order  to  chaos  transitions  for  the 
harmonically-driven,  linearly-damped,  pendulum.  Several  transitions  between  order  and  chaos  are  seen  to 
occur.  The  apparent  ordered  and  chaotic  regimes  are  confirmed  by  the  corresponding  Lyapunov  exponents 
which  alternate  between  negative  and  positive  values  for  the  ordered  and  chaotic  regions  respectively. 
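
The stroboscopic sampling behind a bifurcation diagram can be sketched as follows. This is an illustrative
outline rather than the procedure used for figure 4.11: it assumes the normalized equation of motion
$\ddot{\theta} = -\dot{\theta}/Q - \sin\theta + \gamma\cos\tilde{\omega}t$ with $Q = 2$ and $\tilde{\omega} = 2/3$, starts from rest at $\theta = 0$, discards a
fixed number of transient cycles, and then records $\omega$ once per drive period. The helper names, the tolerance
used to count branches, and the list of drive strengths are hypothetical choices.

```python
import math

Q = 2.0                              # quality factor (assumed)
W_DRIVE = 2.0 / 3.0                  # normalized drive frequency (assumed)
PERIOD = 2.0 * math.pi / W_DRIVE     # one drive period


def derivs(t, theta, omega, gamma):
    """Assumed equation of motion: theta'' = -theta'/Q - sin(theta) + gamma cos(w t)."""
    return omega, -omega / Q - math.sin(theta) + gamma * math.cos(W_DRIVE * t)


def rk4_step(t, theta, omega, dt, gamma):
    """Advance (theta, omega) by one fourth-order Runge-Kutta step."""
    k1t, k1w = derivs(t, theta, omega, gamma)
    k2t, k2w = derivs(t + dt / 2, theta + dt / 2 * k1t, omega + dt / 2 * k1w, gamma)
    k3t, k3w = derivs(t + dt / 2, theta + dt / 2 * k2t, omega + dt / 2 * k2w, gamma)
    k4t, k4w = derivs(t + dt, theta + dt * k3t, omega + dt * k3w, gamma)
    theta += dt / 6 * (k1t + 2 * k2t + 2 * k3t + k4t)
    omega += dt / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
    return theta, omega


def strobe_omega(gamma, n_transient=100, n_keep=32, steps_per_cycle=200):
    """Sample the angular velocity once per drive cycle after discarding transients."""
    dt = PERIOD / steps_per_cycle
    theta, omega = 0.0, 0.0
    samples = []
    for cycle in range(n_transient + n_keep):
        for i in range(steps_per_cycle):
            t = (cycle * steps_per_cycle + i) * dt
            theta, omega = rk4_step(t, theta, omega, dt, gamma)
        if cycle >= n_transient:
            samples.append(omega)
    return samples


def count_branches(samples, tol=1e-3):
    """Count sampled values that differ by more than tol (one column of the diagram)."""
    distinct = []
    for s in samples:
        if all(abs(s - d) > tol for d in distinct):
            distinct.append(s)
    return len(distinct)


for gamma in (0.9, 1.07, 1.105):
    print(gamma, count_branches(strobe_omega(gamma)))
```

Plotting the sampled values against $\gamma$ for a fine grid of drive strengths would build up a diagram of the
kind shown in figure 4.11; a single branch indicates period-one motion, while many scattered values suggest
chaos.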


4.6.3 Poincaré Section

State-space plots are very useful for characterizing periodic motion, but they become too dense for useful
interpretation when the system approaches chaos as illustrated in figure 4.11. Poincaré sections solve this
difficulty by taking a stroboscopic sample once per cycle of the state-space diagram. That is, the point on
the state space orbit is sampled once per drive cycle. For period-1 motion this corresponds to a single
point $(\theta, \omega)$. For period-2 motion this corresponds to two points, etc. For chaotic systems the sequence of
state-space sample points follows complicated trajectories. Figure 4.12 shows the Poincaré sections for the
corresponding state space diagram shown in figure 4.9 for cycles 10 to 6000. Note that the complicated curves
do not cross or repeat. Enlargements of any part of this plot will show increasingly dense parallel
trajectories, called fractals, that indicate the complexity of the chaotic cyclic motion. That is, zooming in
on a small section of this Poincaré plot shows many closely parallel trajectories. The fractal attractors are
surprisingly robust to large differences in initial conditions. Poincaré sections are a sensitive probe of
periodic motion for systems where periodic motion is not readily apparent.
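
A Poincaré section can be sketched in a few lines of code. The example below is a hedged illustration, not
the calculation behind figure 4.12: it assumes the normalized equation of motion
$\ddot{\theta} = -\dot{\theta}/Q - \sin\theta + F_D\sin\tilde{\omega}t$ with $Q = 2$ and $\tilde{\omega} = 2/3$, and records the point $(\theta, \omega)$ once per
drive period, with $\theta$ wrapped into $[-\pi, \pi)$. The function names, initial conditions, and cycle counts
are illustrative.

```python
import math

Q = 2.0              # quality factor (assumed)
W_DRIVE = 2.0 / 3.0  # normalized drive frequency (assumed)


def derivs(t, theta, omega, f_d):
    """Assumed equation of motion: theta'' = -theta'/Q - sin(theta) + F_D sin(w t)."""
    return omega, -omega / Q - math.sin(theta) + f_d * math.sin(W_DRIVE * t)


def rk4_step(t, theta, omega, dt, f_d):
    """Advance (theta, omega) by one fourth-order Runge-Kutta step."""
    k1t, k1w = derivs(t, theta, omega, f_d)
    k2t, k2w = derivs(t + dt / 2, theta + dt / 2 * k1t, omega + dt / 2 * k1w, f_d)
    k3t, k3w = derivs(t + dt / 2, theta + dt / 2 * k2t, omega + dt / 2 * k2w, f_d)
    k4t, k4w = derivs(t + dt, theta + dt * k3t, omega + dt * k3w, f_d)
    theta += dt / 6 * (k1t + 2 * k2t + 2 * k3t + k4t)
    omega += dt / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
    return theta, omega


def poincare_section(f_d, n_transient=100, n_keep=200, steps_per_cycle=200):
    """Return (theta, omega) sampled once per drive period after the transient."""
    dt = 2.0 * math.pi / W_DRIVE / steps_per_cycle
    theta, omega = 0.2, 0.0
    points = []
    for cycle in range(n_transient + n_keep):
        for i in range(steps_per_cycle):
            t = (cycle * steps_per_cycle + i) * dt
            theta, omega = rk4_step(t, theta, omega, dt, f_d)
        if cycle >= n_transient:
            wrapped = (theta + math.pi) % (2.0 * math.pi) - math.pi
            points.append((wrapped, omega))
    return points


def count_distinct(points, tol=1e-3):
    """Count section points separated by more than tol in either coordinate."""
    distinct = []
    for th, om in points:
        if all(abs(th - a) > tol or abs(om - b) > tol for a, b in distinct):
            distinct.append((th, om))
    return len(distinct)


n_ordered = count_distinct(poincare_section(0.5))
n_chaotic = count_distinct(poincare_section(1.2))
print("distinct section points, F_D = 0.5:", n_ordered)
print("distinct section points, F_D = 1.2:", n_chaotic)
```

For the ordered case the section collapses to a single point, whereas for the chaotic case the sampled points
keep landing at new locations; plotting them for several thousand cycles would trace out structure of the
kind shown in figure 4.12.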

In summary, the behavior of the well-known, harmonically-driven, linearly-damped, plane pendulum
becomes remarkably complicated at large driving amplitudes where non-linear effects dominate, that is,
when the restoring force is non-linear. The system exhibits bifurcation where it can evolve to multiple
attractors that depend sensitively on the initial conditions. The system exhibits both oscillatory, and rolling,
solutions depending on the amplitude of the motion. The system exhibits domains of simple ordered motion
separated by domains of very complicated ordered motion as well as chaotic regions. The transitions between
these dramatically different modes of motion are extremely sensitive to the amplitude and phase of the
driver. Eventually the motion becomes completely chaotic. The Lyapunov exponent, bifurcation diagram,
and Poincaré section plots are sensitive measures of the order of the motion. These three sensitive measures
of order and chaos are used extensively in many fields in classical mechanics. Considerable computing
capabilities are required to elucidate the complicated motion involved in non-linear systems.

Figure 4.12: Three Poincaré section plots for the harmonically-driven, linearly-damped, pendulum for various
initial conditions with $F_D = 1.2$, $\tilde{\omega} = 2/3$, and a fixed time step $\Delta t$. These calculations used the
Runge-Kutta method and were performed for 6000 cycles by E. Shah (private communication).

Examples
include  laminar  and  turbulent  flow  in  fluid  dynamics  and  weather  forecasting  of  hurricanes,  where  the 
motion can span a wide dynamic range in dimensions from $10^{-6}$ to $10^{4}$ m.

4.7  Wave  propagation  for  non-linear  systems 

4.7.1  Phase,  group,  and  signal  velocities 

Chapter 3 discussed the wave equation and solutions for linear systems. It was shown that, for linear systems,
the wave motion obeys superposition and exhibits dispersion, that is, a frequency-dependent phase velocity,
and, in some cases, attenuation. Nonlinear systems introduce intriguing new wave phenomena. For example,
for nonlinear systems, second and higher order terms must be included in the Taylor expansion given in
equation 4.2. These second and higher order terms result in the group velocity being a function of $\omega$, that is,
group velocity dispersion occurs, which leads to the shape of the envelope of the wave packet being time
dependent. As a consequence the group velocity in the wave packet is not well defined, and does not equal the
signal velocity of the wave packet or the phase velocity of the wavelets. Nonlinear optical systems have been
studied experimentally where $v_{group} \ll c$, which is called slow light, while other systems have $v_{group} > c$,
which is called superluminal light. The ability to control the velocity of light in such optical systems is of
considerable current interest since it has signal transmission applications.

The dispersion relation for a nonlinear system can be expressed as a Taylor expansion of the form

$$k = k_0 + \left(\frac{\partial k}{\partial \omega}\right)_{\omega=\omega_0}(\omega - \omega_0) + \frac{1}{2}\left(\frac{\partial^2 k}{\partial \omega^2}\right)_{\omega=\omega_0}(\omega - \omega_0)^2 + \cdots \tag{4.41}$$

where $\omega$ is used as the independent variable since it is invariant to phase transitions of the system. Note
that the factor for the first derivative term is the reciprocal of the group velocity

$$\left(\frac{\partial k}{\partial \omega}\right)_{\omega=\omega_0} = \frac{1}{v_{group}}$$

while the factor for the second derivative term is

$$\left(\frac{\partial^2 k}{\partial \omega^2}\right)_{\omega=\omega_0} = \left[\frac{\partial}{\partial \omega}\left(\frac{1}{v_{group}}\right)\right]_{\omega=\omega_0}$$

which gives the velocity dispersion for the system.

The inverse velocities for electromagnetic waves are best represented in terms of the corresponding refractive
indices $n$, where

$$n = \frac{c}{v_{phase}} = \frac{ck}{\omega}$$

and the group refractive index

$$n_{group} = \frac{c}{v_{group}} = c\frac{\partial k}{\partial \omega} = n + \omega\frac{\partial n}{\partial \omega}$$

Then equation 4.47 can be written in the more convenient form

$$k = k_0 + \frac{n_{group}}{c}(\omega - \omega_0) + \frac{1}{2c}\left(\frac{\partial n_{group}}{\partial \omega}\right)_{\omega=\omega_0}(\omega - \omega_0)^2 + \cdots$$

Wave propagation for an optical system that is subject to a single resonance gives one example of
nonlinear frequency response that has applications to optics.

Figure 4.13 shows that the real and imaginary parts of the phase refractive index exhibit the characteristic
resonance frequency dependence of the sinusoidally-driven, linear oscillator that was discussed in chapter 3.6
and as illustrated in figure 3.10. Figure 4.13 also shows the group refractive index $n_{group}$ computed using
equation 4.48.

Note that at resonance $n_{group}$ is reduced below the non-resonant value, which corresponds to superluminal
(fast) light, whereas in the wings of the resonance $n_{group}$ is larger than the non-resonant value, corresponding
to slow light. Thus the nonlinear dependence of the refractive index $n$ on angular frequency $\omega$ leads to fast
or slow group velocities for isolated wave packets. Velocities of light as slow as 17 m/s have been observed.
Experimentally the energy absorption that occurs on resonance makes it difficult to observe the superluminal
electromagnetic wave at resonance.
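
The behavior of the group index near an isolated resonance can be illustrated with a simple model. The
sketch below is not the calculation used for figure 4.13: it assumes a single-resonance (Lorentz-type) form
for the real part of the phase refractive index, with a non-resonant value of 1 and purely illustrative
values for the resonance frequency, width, and strength, and it uses the standard relation
$n_{group} = n + \omega\, dn/d\omega$ that follows from $k = n\omega/c$ and $n_{group} = c\, dk/d\omega$. The derivative is taken by a
central difference.

```python
def phase_index(w, w0=1.0, width=0.01, strength=1e-6):
    """Real part of an assumed single-resonance refractive index (illustrative values)."""
    detuning = w0 ** 2 - w ** 2
    return 1.0 + strength * detuning / (detuning ** 2 + (width * w) ** 2)


def group_index(w, h=1e-6):
    """Group index n + w dn/dw, with dn/dw estimated by a central difference."""
    dn_dw = (phase_index(w + h) - phase_index(w - h)) / (2.0 * h)
    return phase_index(w) + w * dn_dw


n_g_resonance = group_index(1.0)    # at the resonance frequency
n_g_low_wing = group_index(0.99)    # one width below resonance
n_g_high_wing = group_index(1.01)   # one width above resonance
print("on resonance:", n_g_resonance)
print("wings:", n_g_low_wing, n_g_high_wing)
```

In this model the group index falls below the non-resonant value on resonance, where $dn/d\omega$ is negative, and
rises above it in the wings, which is the qualitative pattern described for figure 4.13.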

Note that Sommerfeld and Brillouin showed that even though the group velocity may exceed $c$, the signal
velocity, marking the arrival of the leading edge of the optical pulse, does not exceed $c$, the velocity of
light in vacuum, as was postulated by Einstein. [Bril4]

Figure 4.13: The real and imaginary parts of the phase refractive index $n$ plus the real part of the group
refractive index associated with an isolated atomic resonance.

4.7.2 Soliton wave propagation

The soliton is a fascinating and very special wave propagation phenomenon that occurs for certain
non-linear systems. The soliton is a self-reinforcing solitary localized wave packet that maintains its shape
while travelling long distances at a constant speed. Solitons are caused by a cancellation of phase modulation
resulting from non-linear velocity dependence, and the group velocity dispersive effects in a medium. Solitons
arise as solutions of a widespread class of weakly-nonlinear dispersive partial differential equations
describing many physical systems. Figure 4.14 shows a soliton comprising a solitary water wave approaching
the coast of Hawaii. While the soliton in figure 4.14 may appear like a normal wave, it is unique in that
there are no other waves accompanying it. This wave was probably created far away from the shore when a
normal wave was modulated by a geometrical change in the ocean depth, such as the rising sea floor, which
forced it into the appropriate shape for a soliton. The wave was then able to travel to the coast intact,
despite the apparently placid nature of the ocean near the beach. Solitons are notable in that they interact
with each other in ways very different from normal waves. Normal waves are known for their complicated
interference patterns that depend on the frequency and wavelength of the waves. Solitons can pass right
through each other without being affected at all. This makes solitons very appealing to scientists because
soliton waves are more sturdy than normal waves and can therefore be used to transmit information in ways
that are distinctly different than for normal wave motion. For example, optical solitons are used in optical
fibers made of a dispersive, nonlinear optical medium, to transmit optical pulses with an invariant shape.

Figure 4.14: A solitary wave approaches the coast of Hawaii. (Image: Robert Odom/University of Washington)

Solitons were first observed in 1834 by John Scott Russell (1808-1882). Russell was an engineer
conducting experiments to increase the efficiency of canal boats. His experimental and theoretical
investigations allowed him to recreate the phenomenon in wave tanks that he built in his home. Through his
extensive studies, Scott Russell noticed that soliton propagation exhibited the following properties:

• The  waves  are  stable  and  hold  their  shape  for  long  periods  of  time. 

• The  waves  can  travel  over  long  distances  at  uniform  speed. 

• The  speed  of  propagation  of  the  wave  depends  on  the  size  of  the  wave,  with  larger  waves  traveling 
faster  than  smaller  waves. 

• The waves maintain their shape when they collide, seemingly passing right through each other.

Scott Russell's work was met with scepticism by the scientific community. The problem with the Wave
of Translation was that it was an effect that depended on nonlinear effects, whereas previously existing
theories of hydrodynamics (such as those of Newton and Bernoulli) only dealt with linear systems. George
Biddell Airy and George Gabriel Stokes published papers attacking Scott Russell's observations because
the observations could not be explained by their theories of wave propagation in water. Regardless, Scott
Russell was convinced of the prime importance of the Wave of Translation and history proved that he was
correct. Scott Russell went on to develop the "wave line" system of hull construction that revolutionized
nineteenth century naval architecture, along with a number of other great accomplishments that rewarded
him with much fame and prominence. Despite all of the success in his career, he continued throughout his
life to pursue his studies of the Wave of Translation.

In 1895 Korteweg and de Vries developed a wave equation for surface waves in shallow water,

$$\frac{\partial \phi}{\partial t} + \frac{\partial^3 \phi}{\partial x^3} + 6\phi\frac{\partial \phi}{\partial x} = 0 \tag{4.49}$$

A solution of this equation has the characteristics of a solitary wave with fixed shape. It is given by
substituting the form $\phi(x,t) = f(x - vt)$ into the Korteweg-de Vries equation, which gives

$$-v\frac{df}{dx} + \frac{d^3 f}{dx^3} + 6f\frac{df}{dx} = 0 \tag{4.50}$$

Integrating with respect to $x$ gives

$$3f^2 + \frac{d^2 f}{dx^2} - vf = C \tag{4.51}$$

where $C$ is a constant of integration. For a localized wave that vanishes far from its center, $C = 0$, and
this non-linear equation has the solution

$$\phi(x,t) = \frac{v}{2}\,\mathrm{sech}^2\left[\frac{\sqrt{v}}{2}(x - vt - a)\right] \tag{4.52}$$

where $a$ is a constant. Equation 4.52 is the equation of a solitary wave moving in the $+x$ direction at a
velocity $v$.
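
The solitary-wave solution can be checked numerically. The sketch below evaluates the left-hand side of the
Korteweg-de Vries equation 4.49 for the profile $\phi = \frac{v}{2}\mathrm{sech}^2\left[\frac{\sqrt{v}}{2}(x - vt - a)\right]$ using central finite
differences, so the residual should be zero only up to discretization and rounding error. The function names,
step size, and sample grid are illustrative choices.

```python
import math


def soliton(x, t, v=1.0, a=0.0):
    """Solitary-wave profile (v/2) sech^2[(sqrt(v)/2)(x - v t - a)]."""
    arg = 0.5 * math.sqrt(v) * (x - v * t - a)
    return 0.5 * v / math.cosh(arg) ** 2


def kdv_residual(x, t, v=1.0, h=1e-3):
    """Finite-difference estimate of phi_t + phi_xxx + 6 phi phi_x for the profile above."""
    phi = soliton(x, t, v)
    phi_t = (soliton(x, t + h, v) - soliton(x, t - h, v)) / (2.0 * h)
    phi_x = (soliton(x + h, t, v) - soliton(x - h, t, v)) / (2.0 * h)
    phi_xxx = (soliton(x + 2 * h, t, v) - 2.0 * soliton(x + h, t, v)
               + 2.0 * soliton(x - h, t, v) - soliton(x - 2 * h, t, v)) / (2.0 * h ** 3)
    return phi_t + phi_xxx + 6.0 * phi * phi_x


grid = [-10.0 + 0.25 * i for i in range(81)]
max_residual = {v: max(abs(kdv_residual(x, 0.5, v)) for x in grid) for v in (1.0, 2.0)}
print("largest residual on the grid:", max_residual)
print("peak heights:", soliton(0.0, 0.0, 1.0), soliton(0.0, 0.0, 2.0))
```

Because the peak height is $v/2$, a faster soliton is also a taller one, in line with Scott Russell's
observation that larger waves travel faster than smaller waves.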

Soliton behavior is observed in phenomena such as tsunamis, tidal bores that occur for some rivers,
signals in optical fibres, plasmas, atmospheric waves, vortex filaments, superconductivity, and gravitational
fields having cylindrical symmetry. Much work has been done on solitons for fibre optics applications. The
soliton's inherent stability makes long-distance transmission possible without the use of repeaters, and could
potentially double the transmission capacity.

Before the discovery of solitons, mathematicians were under the impression that nonlinear partial
differential equations could not be solved exactly. However, solitons led to the recognition that there are
non-linear systems that can be solved analytically. This discovery has prompted much investigation into these
so-called "integrable systems." Such systems are rare, as most non-linear differential equations admit chaotic
behavior with no explicit solutions. Integrable systems nevertheless lead to very interesting mathematics
ranging from differential geometry and complex analysis to quantum field theory and fluid dynamics.

Many of the fundamental equations in physics (Maxwell's, Schrödinger's) are linear equations. However,
physicists have begun to recognize many areas of physics in which nonlinearity can result in qualitatively
new phenomena which cannot be constructed via perturbation theory starting from linearized equations.
These include phenomena in magnetohydrodynamics, meteorology, oceanography, condensed matter physics,
nonlinear optics, and elementary particle physics. For example, the European space mission Cluster detected
a soliton-like electrical disturbance that travelled through the ionized gas surrounding the Earth, starting
about 50,000 kilometers from Earth and travelling towards the planet at about 8 km/s. It is thought that
this soliton was generated by turbulence in the magnetosphere.

Efforts to understand the nonlinearity of solitons have led to much research in many areas of physics. In
the  context  of  solitons,  their  particle-like  behavior  (in  that  they  are  localized  and  preserved  under  collisions) 
leads  to  a number  of  experimental  and  theoretical  applications.  The  technique  known  as  bosonization  allows 
viewing  particles,  such  as  electrons  and  positrons,  as  solitons  in  appropriate  field  equations.  There  are 
numerous  macroscopic  phenomena,  such  as  internal  waves  on  the  ocean,  spontaneous  transparency,  and  the 
behavior  of  light  in  fiber  optic  cable,  that  are  now  understood  in  terms  of  solitons.  These  phenomena  are 
being  applied  to  modern  technology. 


4.8  Summary 

The  study  of  the  dynamics  of  non-linear  systems  remains  a vibrant  and  rapidly  evolving  field  in  classical 
mechanics  as  well  as  many  other  branches  of  science.  This  chapter  has  discussed  examples  of  non-linear 
systems  in  classical  mechanics.  It  was  shown  that  the  superposition  principle  is  broken  even  for  weak 
nonlinearity.  It  was  shown  that  increased  nonlinearity  leads  to  bifurcation,  point  attractors,  limit-cycle 
attractors,  and  sensitivity  to  initial  conditions. 

Limit-cycle attractors: The Poincaré-Bendixson theorem for limit cycle attractors states that the
paths, both in state-space and phase-space, can take one of three possible forms:

(1) closed paths, like the elliptical paths for the undamped harmonic oscillator,

(2) paths that terminate at an equilibrium point as $t \to \infty$, like the point attractor for a damped
harmonic oscillator,

(3) paths that tend to a limit cycle as $t \to \infty$.

The limit cycle is unusual in that the periodic motion tends asymptotically to the limit-cycle attractor
independent of whether the initial values are inside or outside the limit cycle. The balance of dissipative forces
and driving forces often leads to limit-cycle attractors, especially in biological applications. Identification of
limit-cycle attractors, as well as the trajectories of the motion towards these limit-cycle attractors, is more
complicated than for point attractors.

The van der Pol oscillator is a common example of a limit-cycle system that has an equation of motion
of the form

$$\frac{d^2 x}{dt^2} + \mu(x^2 - 1)\frac{dx}{dt} + \omega_0^2 x = 0 \tag{4.11}$$

The van der Pol oscillator includes non-linear damping and has a limit-cycle attractor, that is, it exhibits
periodic solutions that asymptotically approach one attractor solution independent of the initial conditions.
There are many examples in nature that exhibit similar behavior.
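
The independence of the limit cycle from the initial conditions can be illustrated numerically. The sketch
below integrates equation 4.11 with a hand-written fourth-order Runge-Kutta step, once starting well inside
the limit cycle and once starting well outside it, and compares the late-time oscillation amplitudes. The
values $\mu = 1$ and $\omega_0 = 1$, the starting points, and the function names are illustrative assumptions.

```python
MU = 1.0   # non-linear damping strength (illustrative value)
W0 = 1.0   # natural angular frequency (illustrative value)


def derivs(x, v):
    """van der Pol equation (4.11) written as two first-order equations."""
    return v, -MU * (x * x - 1.0) * v - W0 ** 2 * x


def rk4_step(x, v, dt):
    """Advance (x, v) by one fourth-order Runge-Kutta step."""
    k1x, k1v = derivs(x, v)
    k2x, k2v = derivs(x + dt / 2 * k1x, v + dt / 2 * k1v)
    k3x, k3v = derivs(x + dt / 2 * k2x, v + dt / 2 * k2v)
    k4x, k4v = derivs(x + dt * k3x, v + dt * k3v)
    x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
    v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return x, v


def late_amplitude(x0, v0=0.0, dt=0.01, t_total=100.0, t_measure=20.0):
    """Largest |x| during the final t_measure time units of the run."""
    n_steps = int(round(t_total / dt))
    n_measure = int(round(t_measure / dt))
    x, v = x0, v0
    amplitude = 0.0
    for step in range(n_steps):
        x, v = rk4_step(x, v, dt)
        if step >= n_steps - n_measure:
            amplitude = max(amplitude, abs(x))
    return amplitude


amp_inside = late_amplitude(0.1)   # starts inside the limit cycle
amp_outside = late_amplitude(4.0)  # starts outside the limit cycle
print("late-time amplitudes:", amp_inside, amp_outside)
```

Both runs settle onto the same closed orbit, whose amplitude is close to 2 for this choice of $\mu$, regardless
of whether the motion starts inside or outside the limit cycle.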

Harmonically-driven,  linearly-damped,  plane  pendulum:  The  non-linearity  of  the  well-known 
driven  linearly-damped  plane  pendulum  was  used  as  an  excellent  example  of  the  behavior  of  non-linear 
systems  in  nature.  It  was  shown  that  non-linearity  leads  to  discontinuous  period  bifurcation,  extreme 
sensitivity  to  initial  conditions,  rolling  motion  and  chaos. 

Differentiation  between  ordered  and  chaotic  motion:  Lyapunov  exponents,  bifurcation  diagrams, 
and Poincaré sections were used to identify the transition from order to chaos. Chapter 15.8 discusses
the non-linear Navier-Stokes equations of viscous-fluid flow which lead to complicated transitions between
laminar  and  turbulent  flow.  Fluid  flow  exhibits  remarkable  complexity  that  nicely  illustrates  the  dominant 
role  that  non-linearity  can  have  on  the  solutions  of  practical  non-linear  systems  in  classical  mechanics. 

Wave  propagation  for  non-linear  systems:  Non-linear  equations  can  lead  to  unexpected  behavior 
for  wave  packet  propagation  such  as  fast  or  slow  light  as  well  as  soliton  solutions.  Moreover,  it  is  notable 
that  some  non-linear  systems  can  lead  to  analytic  solutions. 

The complicated phenomena exhibited by the above non-linear systems are not restricted to classical
mechanics, rather they are a manifestation of the mathematical behavior of the solutions of the differential
equations involved. That is, this behavior is a general manifestation of the behavior of solutions for
second-order differential equations. Exploration of this complex motion has only become feasible with the advent
of powerful computer facilities during the past three decades. The breadth of phenomena exhibited by
these examples is manifest in myriads of other nonlinear systems, including many-body motion, weather
patterns, growth of biological species, epidemics, motion of electrons in atoms, etc. Other examples of
non-linear equations of motion not discussed here are the three-body problem, which is mentioned in chapter 9,
and turbulence in fluid flow which is discussed in chapter 15.

It  is  stressed  that  the  behavior  discussed  in  this  chapter  is  very  different  from  the  random  walk  problem 
which  is  a stochastic  process  where  each  step  is  purely  random  and  not  deterministic.  This  chapter  has 
assumed  that  the  motion  is  fully  deterministic  and  rigorously  follows  the  laws  of  classical  mechanics.  Even 
though  the  motion  is  fully  deterministic,  and  follows  the  laws  of  classical  mechanics,  the  motion  is  extremely 
sensitive  to  the  initial  conditions  and  the  non-linearities  can  lead  to  chaos.  Computer  modelling  is  the  only 
viable  approach  for  predicting  the  behavior  of  such  non-linear  systems.  The  complexity  of  solving  non-linear 
equations  is  the  reason  that  this  book  will  continue  to  consider  only  linear  systems.  Fortunately,  in  nature, 
non-linear  systems  can  be  approximately  linear  when  the  small-amplitude  assumption  is  applicable. 
Workshop exercises

1. Consider the chaotic motion of the driven damped pendulum whose equation of motion is given by

$$\ddot{\phi} + \Gamma\dot{\phi} + \omega_0^2 \sin\phi = \gamma\omega_0^2 \cos\omega t$$

for which the Lyapunov exponent is $\lambda = 1$ with time measured in units of the drive period.

(a) Assume that you need to predict $\phi(t)$ with an accuracy of $10^{-2}$ radians, and that you know the initial value $\phi(0)$ to within $10^{-6}$ radians. What is the maximum time horizon $t_{max}$ for which you can predict $\phi(t)$ within the required accuracy?

(b) Suppose that, with unlimited time and financial resources, you manage to improve the accuracy of the initial value to $10^{-9}$ radians (that is, a thousand-fold improvement). What is the time horizon now for achieving the accuracy of $10^{-2}$ radians?

(c) By what factor has $t_{max}$ improved with the 1000-fold improvement in the initial measurement?

(d) What does this imply regarding long-term predictions of chaotic motion?


2. A non-linear oscillator satisfies the equation $\ddot{x} + \dot{x}^3 + x = 0$. Find the polar equations for the motion in the state-space diagram. Show that any trajectory that starts within the circle $r < 1$ encircles the origin infinitely many times in the clockwise direction. Show further that these trajectories in state space terminate at the origin.

3.  Consider  the  system  of  a mass  suspended  between  two  identical  springs  as  shown. 


If each spring is stretched a distance $d$ to attach the mass at the equilibrium position, the mass is subject to two equal and oppositely directed forces of magnitude $\kappa d$. Ignore gravity. Show that the potential in which the mass moves is approximately


Construct  a state-space  diagram  for  this  potential. 


Problems 

1. A non-linear oscillator satisfies the equation

$$\ddot{x} + (x^2 + \dot{x}^2 - 1)\dot{x} + x = 0$$

Find the polar equations for the motion in the state-space diagram. Show that any trajectory that starts in the domain $1 < r < \sqrt{3}$ spirals clockwise and tends to the limit cycle $r = 1$. [The same is true of trajectories that start in the domain $0 < r < 1$.] What is the period of the limit cycle?

2. A mass $m$ moves in one direction and is subject to a constant force $+F_0$ when $x < 0$ and to a constant force $-F_0$ when $x > 0$. Describe the motion by constructing a state space diagram. Calculate the period of the motion in terms of $m$, $F_0$ and the amplitude $A$. Disregard damping.


Chapter  5 


Calculus  of  variations 


5.1  Introduction 

The  prior  chapters  have  focussed  on  the  intuitive  Newtonian  perspective  of  classical  mechanics,  which  is 
based  on  vector  quantities  like  force,  momentum,  and  acceleration.  Newtonian  mechanics  leads  to  second- 
order  differential  equations  of  motion.  The  calculus  of  variations  underlies  a powerful  alternative  approach 
to  classical  mechanics  that  is  based  on  identifying  the  path  that  minimizes  an  integral  quantity.  This  integral 
variational  approach  was  first  championed  by  Gottfried  Wilhelm  Leibniz,  contemporaneously  with  Newton’s 
development  of  the  differential  approach  to  classical  mechanics. 

During  the  18th  century,  Bernoulli,  who  was  a student  of  Leibniz,  developed  the  field  of  variational 
calculus  which  underlies  the  integral  variational  approach  to  mechanics.  He  solved  the  brachistochrone 
problem  which  involves  finding  the  path  for  which  the  transit  time  between  two  points  is  the  shortest.  The 
integral  variational  approach  also  underlies  Fermat’s  principle  in  optics,  which  can  be  used  to  derive  that 
the  angle  of  reflection  equals  the  angle  of  incidence,  as  well  as  derive  Snell’s  law.  Other  applications  of  the 
calculus  of  variations  include  solving  the  catenary  problem,  finding  the  maximum  and  minimum  distances 
between  two  points  on  a surface,  polygon  shapes  having  the  maximum  ratio  of  enclosed  area  to  perimeter, 
or maximizing profit in economics. Bernoulli developed the principle of virtual work used to describe
equilibrium  in  static  systems,  and  d’Alembert  extended  the  principle  of  virtual  work  to  dynamical  systems. 
Euler,  the  preeminent  Swiss  mathematician  of  the  18th  century  and  a student  of  Bernoulli,  developed  the 
calculus  of  variations  with  full  mathematical  rigor.  Lagrange  (1736-1813), a student  of  Euler,  culminated  the 
development  of  the  Lagrangian  variational  approach  to  classical  mechanics. 

The Euler-Lagrangian approach to classical mechanics stems from a deep philosophical belief that the laws of nature are based on a principle of economy. That is, the physical universe follows paths through space and time that are based on extrema principles. The standard Lagrangian $L$ is defined as the difference between the kinetic and potential energy, that is

$$L = T - U \tag{5.1}$$

Chapters 6 and 13 will show that the laws of classical mechanics can be expressed in terms of Hamilton’s variational principle which states that the motion of the system between the initial time $t_i$ and final time $t_f$ follows a path that minimizes the scalar action integral $S$ defined as the time integral of the Lagrangian.

$$S = \int_{t_i}^{t_f} L \, dt \tag{5.2}$$

The  calculus  of  variations  provides  the  mathematics  required  to  determine  the  path  that  minimizes  the 
action  integral.  This  variational  approach  is  both  elegant  and  beautiful,  and  has  withstood  the  rigors 
of  experimental  confirmation.  In  fact,  not  only  is  it  an  exceedingly  powerful  alternative  approach  to  the 
intuitive  Newtonian  approach  to  mechanics,  but  Hamilton’s  variational  principle  now  is  recognized  to  be 
more  fundamental  than  Newton’s  Laws  of  Motion.  The  Lagrangian  and  Hamiltonian  variational  approaches 
to  mechanics  are  the  only  approaches  that  can  handle  the  Theory  of  Relativity,  statistical  mechanics,  and 
the  dichotomy  of  philosophical  approaches  to  quantum  physics. 
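
The statement that the physical path minimizes the action integral (5.2) can be checked numerically for a simple case. The sketch below is only an illustration under stated assumptions: a mass in a uniform gravitational field with $L = \frac{1}{2}m\dot{y}^2 - mgy$, a flight time of one second, and a trial perturbation $\epsilon\sin(\pi t/T)$ that vanishes at both end times. The function name `action` and all numerical values are hypothetical.

```python
import math


def action(epsilon, m=1.0, g=9.81, total_time=1.0, n=2000):
    """Midpoint-rule estimate of S = integral of (T - U) dt along a trial path.

    The trial path is the Newtonian trajectory y0(t) = v0 t - g t^2 / 2,
    which starts and ends at y = 0, plus epsilon * sin(pi t / total_time).
    """
    v0 = 0.5 * g * total_time      # launch speed that returns to y = 0 at t = total_time
    dt = total_time / n
    s = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        phase = math.pi * t / total_time
        y = v0 * t - 0.5 * g * t * t + epsilon * math.sin(phase)
        ydot = v0 - g * t + epsilon * (math.pi / total_time) * math.cos(phase)
        kinetic = 0.5 * m * ydot ** 2
        potential = m * g * y
        s += (kinetic - potential) * dt
    return s


for eps in (-0.1, 0.0, 0.1):
    print(eps, action(eps))
```

For this Lagrangian the perturbed paths should give a larger action than the unperturbed Newtonian path, consistent with the action being a minimum at $\epsilon = 0$.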


5.2 Euler’s differential equation

The calculus of variations, presented here, underlies the powerful variational approaches that were developed for classical mechanics. Variational calculus now is applied to many other disciplines in science, engineering, economics, and medicine.

For the special case of one dimension, the calculus of variations reduces to varying the function $y(x)$ such that the scalar functional $F$

$$F = \int_{x_1}^{x_2} f[y(x), y'(x); x] \, dx \tag{5.3}$$

is an extremum, that is, it is a maximum or minimum. Here $x$ is the independent variable, $y(x)$ is the dependent variable, and $y' \equiv \frac{dy}{dx}$ is its first derivative. The quantity $f[y(x), y'(x); x]$ has some given dependence on $y$, $y'$ and $x$. The calculus of variations involves varying the function $y(x)$ until a stationary value of $F$ is found, which is presumed to be an extremum. This means that if a function $y = y(x)$ gives a minimum value for the scalar functional $F$, then any neighboring function, no matter how close to $y(x)$, must increase $F$. For all paths, the integral $F$ is taken between two fixed points, $(x_1, y_1)$ and $(x_2, y_2)$. Possible paths between the initial and final points are illustrated in figure 5.1. The functional $F$ must have a stationary value relative to any neighboring path, and the path giving this stationary value is presumed to be the correct extremum path.

Define a neighboring function using a parametric representation $y(\epsilon, x)$ such that for $\epsilon = 0$, $y = y(0, x) = y(x)$ is the function that yields the extremum for $F$. Assume that an infinitesimally small fraction $\epsilon$ of the neighboring function $\eta(x)$ is added to the extremum path $y(x)$. That is, assume

$$y(\epsilon, x) = y(0, x) + \epsilon\eta(x) \tag{5.4}$$

$$y'(\epsilon, x) \equiv \frac{dy(\epsilon, x)}{dx} = \frac{dy(0, x)}{dx} + \epsilon\frac{d\eta}{dx}$$

where it is assumed that the extremum function $y(0, x)$ and the auxiliary function $\eta(x)$ are well behaved functions of $x$ with continuous first derivatives, and where $\eta(x)$ vanishes at $x_1$ and $x_2$, because for all possible paths the function $y(\epsilon, x)$ must be identical with $y(x)$ at the end points of the path, i.e. $\eta(x_1) = \eta(x_2) = 0$. The situation is depicted in figure 5.1.

It is possible to express any such parametric family of curves $F$ as a function of $\epsilon$

$$F(\epsilon) = \int_{x_1}^{x_2} f[y(\epsilon, x), y'(\epsilon, x); x] \, dx \tag{5.5}$$

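The condition that the integral has a stationary (extremum) value is that $F$ be independent of $\epsilon$ to first order along the path giving the extremum value ($\epsilon = 0$). That is

$$\left. \frac{\partial F}{\partial \epsilon} \right|_{\epsilon = 0} = 0 \tag{5.6}$$

for all functions $\eta(x)$. This is illustrated on the right side of figure 5.1.

Applying condition (5.6) to equation (5.5), and since $x$ is independent of $\epsilon$, then

$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial y}\frac{\partial y}{\partial \epsilon} + \frac{\partial f}{\partial y'}\frac{\partial y'}{\partial \epsilon} \right) dx = 0 \tag{5.7}$$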


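Since the limits of integration are fixed, the differential operation affects only the integrand. From equation (5.4),

$$\frac{\partial y}{\partial \epsilon} = \eta(x) \tag{5.8}$$

and

$$\frac{\partial y'}{\partial \epsilon} = \frac{d\eta}{dx} \tag{5.9}$$

Consider the second term in the integrand

$$\int_{x_1}^{x_2} \frac{\partial f}{\partial y'}\frac{\partial y'}{\partial \epsilon} \, dx = \int_{x_1}^{x_2} \frac{\partial f}{\partial y'}\frac{d\eta}{dx} \, dx \tag{5.10}$$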


Figure 5.1: The left shows the extremum path $y(x)$ that minimizes the functional $F = \int_{x_1}^{x_2} f[y(x), y'(x); x] \, dx$, together with neighboring paths $y(\epsilon, x) = y(x) + \epsilon\eta(x)$ between $(x_1, y_1)$ and $(x_2, y_2)$. The right shows the dependence of $F$ on the admixture coefficient $\epsilon$ for a maximum (upper) or a minimum (lower) at $\epsilon = 0$.


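Integrating by parts, using

$$\int u \, dv = uv - \int v \, du \tag{5.11}$$

gives

$$\int_{x_1}^{x_2} \frac{\partial f}{\partial y'}\frac{d\eta}{dx} \, dx = \left[ \frac{\partial f}{\partial y'}\eta(x) \right]_{x_1}^{x_2} - \int_{x_1}^{x_2} \eta(x)\frac{d}{dx}\left( \frac{\partial f}{\partial y'} \right) dx \tag{5.12}$$

Note that the first term on the right-hand side is zero since by definition $\eta(x) = 0$ at $x_1$ and $x_2$. Thus

$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial y}\frac{\partial y}{\partial \epsilon} + \frac{\partial f}{\partial y'}\frac{\partial y'}{\partial \epsilon} \right) dx = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial y}\eta(x) - \eta(x)\frac{d}{dx}\frac{\partial f}{\partial y'} \right) dx$$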

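Thus equation 5.7 reduces to

$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} \right)\eta(x) \, dx \tag{5.13}$$

The function will be an extremum if it is stationary at $\epsilon = 0$. That is,

$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} \right)\eta(x) \, dx = 0 \tag{5.14}$$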


This integral now appears to be independent of $\epsilon$. However, the functions $y$ and $y'$ occurring in the derivatives are functions of $\epsilon$. Since $\left(\frac{\partial F}{\partial \epsilon}\right)_{\epsilon = 0}$ must vanish for a stationary value, and because $\eta(x)$ is an arbitrary function subject to the conditions stated, then the above integrand must be zero. This requirement that the integrand must be zero leads to Euler’s differential equation

$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0 \tag{5.15}$$

where $y$ and $y'$ are the original functions, independent of $\epsilon$. The basis of the calculus of variations is that the function $y(x)$ that satisfies Euler’s equation is a stationary function. Note that the stationary value could be either a maximum or a minimum value. When Euler’s equation is applied to mechanical systems using the Lagrangian as the functional, then Euler’s differential equation is called the Euler-Lagrange equation.
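
As a brief illustration of the last remark, Euler’s equation can be applied with time $t$ as the independent variable and the Lagrangian of equation (5.1) as the function $f$. The example below is a sketch under an assumption that is not part of the text above: a mass $m$ attached to a linear spring of constant $k$, so that $T = \frac{1}{2}m\dot{x}^2$ and $U = \frac{1}{2}kx^2$.

```latex
% Illustrative assumption: mass m on a linear spring of constant k,
% with x(t) as the dependent variable and t as the independent variable.
\begin{align*}
L &= T - U = \tfrac{1}{2} m \dot{x}^2 - \tfrac{1}{2} k x^2 \\
\frac{\partial L}{\partial x} &= -k x, \qquad
\frac{\partial L}{\partial \dot{x}} = m \dot{x} \\
\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial \dot{x}}
  &= -k x - m \ddot{x} = 0
  \quad\Longrightarrow\quad m \ddot{x} = -k x
\end{align*}
```

In this case the Euler-Lagrange equation reproduces the Newtonian equation of motion for the harmonic oscillator.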
5.3 Applications of Euler’s equation

5.1  Example:  Shortest  distance  between  two  points 

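[Figure: Shortest distance between two points in a plane.]

Consider a path that lies in the $x$-$y$ plane. The infinitesimal length of arc is

$$ds = \sqrt{dx^2 + dy^2} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2} \, dx$$

Then the length of the arc is

$$L = \int_{1}^{2} ds = \int_{x_1}^{x_2} \sqrt{1 + \left(\frac{dy}{dx}\right)^2} \, dx$$

The function $f$ is

$$f = \sqrt{1 + (y')^2}$$

Therefore

$$\frac{\partial f}{\partial y} = 0$$

and

$$\frac{\partial f}{\partial y'} = \frac{y'}{\sqrt{1 + (y')^2}}$$

Inserting these into Euler’s equation 5.15 gives

$$\frac{d}{dx}\left( \frac{y'}{\sqrt{1 + (y')^2}} \right) = 0$$

that is

$$\frac{y'}{\sqrt{1 + (y')^2}} = \text{constant} = C$$

This is valid if

$$y' = \frac{C}{\sqrt{1 - C^2}} = a$$

Therefore

$$y = ax + b$$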

which  is  the  equation  of  a straight  line  in  the  plane.  Thus  the  shortest  path  between  two  points  in  a plane  is 
a straight  line  between  these  points,  as  is  intuitively  obvious.  This  stationary  value  obviously  is  a minimum. 

This  trivial  example  of  the  use  of  Euler’s  equation  to  determine  an  extremum  value  has  given  the  obvious 
answer.  It  has  been  presented  here  because  it  provides  a proof  that  a straight  line  is  the  shortest  distance  in 
a plane  and  illustrates  the  power  of  the  calculus  of  variations  to  determine  extremum  paths. 
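
The stationary behaviour of $F(\epsilon)$ at $\epsilon = 0$ can be seen numerically for this example. The sketch below is illustrative only: it assumes end points $(0, 0)$ and $(1, 1)$, so that the extremum path is $y = x$, and a hypothetical neighboring function $\eta(x) = \sin(\pi x)$ that vanishes at both end points.

```python
import math


def path_length(epsilon, n=2000):
    """Length of y(eps, x) = x + eps * sin(pi x) from (0, 0) to (1, 1).

    Uses the midpoint rule on the integrand sqrt(1 + y'^2).
    """
    dx = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * dx
        yprime = 1.0 + epsilon * math.pi * math.cos(math.pi * x)
        total += math.sqrt(1.0 + yprime ** 2) * dx
    return total


for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(eps, path_length(eps))

# Central-difference estimate of dF/d(eps) at eps = 0
h = 1e-4
slope_at_zero = (path_length(h) - path_length(-h)) / (2.0 * h)
print(slope_at_zero)
```

The printed lengths should be smallest at $\epsilon = 0$, where the length equals $\sqrt{2}$, and the estimated slope there should be consistent with zero, as required by condition (5.6).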


5.2  Example:  Brachistochrone  problem 

The Brachistochrone problem involves finding the path having the minimum transit time between two points. The Brachistochrone problem stimulated the development of the calculus of variations by Johann Bernoulli and Euler. For simplicity, take the case of frictionless motion in the $x$-$y$ plane with a uniform gravitational field acting in the $y$ direction, as shown in the adjacent figure. The question is what constrained path will result in the minimum transit time between two points $(x_1, y_1)$ and $(x_2, y_2)$.


[Figure: The Brachistochrone problem involves finding the path for the minimum transit time for constrained frictionless motion in a uniform gravitational field.]

Consider that the particle of mass $m$ starts at the origin $x_1 = 0$, $y_1 = 0$ with zero velocity. Since the problem conserves energy and assuming that initially $E = KE + PE = 0$ then

$$\frac{1}{2} m v^2 - m g y = 0$$

That is

$$v = \sqrt{2gy}$$

The transit time is given by

$$t = \int_{x_1}^{x_2} \frac{ds}{v} = \int_{x_1}^{x_2} \frac{\sqrt{dx^2 + dy^2}}{\sqrt{2gy}} = \int_{y_1}^{y_2} \sqrt{\frac{1 + x'^2}{2gy}} \, dy$$

where $x' \equiv \frac{dx}{dy}$. Note that, in this example, the independent variable has been chosen to be $y$ and the dependent variable is $x(y)$.

The function $f$ of the integral is

$$f = \frac{1}{\sqrt{2g}}\sqrt{\frac{1 + x'^2}{y}}$$

Factor out the constant $\sqrt{2g}$ term, which does not affect the final equation, and note that

$$\frac{\partial f}{\partial x} = 0, \qquad \frac{\partial f}{\partial x'} = \frac{x'}{\sqrt{y(1 + x'^2)}}$$

Therefore Euler’s equation gives

$$\frac{d}{dy}\left( \frac{x'}{\sqrt{y(1 + x'^2)}} \right) = 0$$

or

$$\frac{x'}{\sqrt{y(1 + x'^2)}} = \text{constant} = \frac{1}{\sqrt{2a}}$$

That is

$$\frac{x'^2}{y\left(1 + (x')^2\right)} = \frac{1}{2a}$$

This may be rewritten as

$$x = \int_{y_1}^{y_2} \frac{y \, dy}{\sqrt{2ay - y^2}}$$

Changing the variable to $y = a(1 - \cos\theta)$ gives $dy = a\sin\theta \, d\theta$, leading to the integral

$$x = \int a(1 - \cos\theta) \, d\theta$$

or

$$x = a(\theta - \sin\theta) + \text{constant}$$
The parametric equations for a cycloid passing through the origin are

$$x = a(\theta - \sin\theta)$$
$$y = a(1 - \cos\theta)$$

which is the form of the solution found. That is, the shortest time between two points is obtained by constraining the motion of the mass to follow a cycloid shape. Thus the mass first accelerates rapidly by falling down steeply and then follows the curve and coasts upward at the end. The elapsed time is obtained by inserting the above parametric relations for $x$ and $y$, in terms of $\theta$, into the transit time integral giving $t = \sqrt{\frac{a}{g}}\,\theta$ where $a$ and $\theta$ are fixed by the end point coordinates. Thus the time to fall from starting with zero velocity at the cusp to the minimum of the cycloid is $\pi\sqrt{\frac{a}{g}}$. If $y_2 = y_1 = 0$ then $x_2 = 2\pi a$ which defines the shape of the cycloid and the minimum time is $2\pi\sqrt{\frac{a}{g}} = \sqrt{\frac{2\pi x_2}{g}}$. If the mass starts with a non-zero initial velocity, then the starting point is not at the cusp of the cycloid, but down a distance $d$ such that the kinetic energy equals the potential energy difference from the cusp.

A modern  application  of  the  Brachistochrone  problem  is  determination  of  the  optimum  shape  of  the  low- 
friction  emergency  chute  that  passengers  slide  down  to  evacuate  a burning  aircraft.  Bernoulli  solved  the 
problem  of  rapid  evacuation  of  an  aircraft  two  centuries  before  the  first  flight  of  a powered  aircraft. 
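
The cycloid transit time can be compared numerically with that of a straight chute joining the same end points. The following is a minimal sketch under stated assumptions: $a = 1$ m, $g = 9.81$ m/s$^2$, and an end point at the bottom of the cycloid ($\theta = \pi$). The straight-line time uses elementary constant-acceleration kinematics for a frictionless incline, which is not derived in the text above, and both function names are hypothetical.

```python
import math


def cycloid_transit_time(a, theta_end, g=9.81, n=20000):
    """Numerically integrate dt = ds / v along x = a(theta - sin theta),
    y = a(1 - cos theta), with v = sqrt(2 g y), using the midpoint rule."""
    dtheta = theta_end / n
    t = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        dx_dtheta = a * (1.0 - math.cos(theta))
        dy_dtheta = a * math.sin(theta)
        y = a * (1.0 - math.cos(theta))
        ds = math.sqrt(dx_dtheta ** 2 + dy_dtheta ** 2) * dtheta
        t += ds / math.sqrt(2.0 * g * y)
    return t


def straight_line_transit_time(x_end, y_end, g=9.81):
    """Time to slide from rest down a frictionless straight incline that
    drops a vertical distance y_end over a horizontal distance x_end."""
    length = math.hypot(x_end, y_end)
    accel = g * y_end / length          # component of g along the incline
    return math.sqrt(2.0 * length / accel)


a = 1.0
theta_end = math.pi
x_end = a * (theta_end - math.sin(theta_end))
y_end = a * (1.0 - math.cos(theta_end))

t_cycloid = cycloid_transit_time(a, theta_end)
t_line = straight_line_transit_time(x_end, y_end)
print(t_cycloid, math.pi * math.sqrt(a / 9.81), t_line)
```

The numerical cycloid time should agree with $\pi\sqrt{a/g}$, and the straight chute should take noticeably longer even though it is the shorter path.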


5.3  Example:  Minimal  travel  cost 

Assume that the cost of flying an aircraft at height $z$ is $e^{-\kappa z}$ per unit distance of flight-path, where $\kappa$ is a positive constant. Consider that the aircraft flies in the $(x, z)$-plane from the point $(-a, 0)$ to the point $(a, 0)$ where $z = 0$ corresponds to ground level, and where the $z$-axis points vertically upwards. Find the extremal for the problem of minimizing the total cost of the journey.

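The differential arc-length element of the flight path $ds$ can be written as

$$ds = \sqrt{dx^2 + dz^2} = \sqrt{1 + z'^2} \, dx$$

where $z' \equiv \frac{dz}{dx}$. Thus the cost integral to be minimized is

$$C = \int_{-a}^{+a} e^{-\kappa z} \, ds = \int_{-a}^{+a} e^{-\kappa z}\sqrt{1 + z'^2} \, dx$$

The function of this integral is

$$f = e^{-\kappa z}\sqrt{1 + z'^2}$$

The partial differentials required for the Euler equations are

$$\frac{d}{dx}\frac{\partial f}{\partial z'} = \frac{z'' e^{-\kappa z}}{\sqrt{1 + z'^2}} - \frac{\kappa z'^2 e^{-\kappa z}}{\sqrt{1 + z'^2}} - \frac{z'' z'^2 e^{-\kappa z}}{(1 + z'^2)^{3/2}}$$

$$\frac{\partial f}{\partial z} = -\kappa e^{-\kappa z}\sqrt{1 + z'^2}$$

Therefore Euler’s equation equals

$$\frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'} = -\kappa e^{-\kappa z}\sqrt{1 + z'^2} - \frac{z'' e^{-\kappa z}}{\sqrt{1 + z'^2}} + \frac{\kappa z'^2 e^{-\kappa z}}{\sqrt{1 + z'^2}} + \frac{z'' z'^2 e^{-\kappa z}}{(1 + z'^2)^{3/2}} = 0$$

This can be simplified by multiplying by the radical factor $(1 + z'^2)^{3/2}$ and dividing by $e^{-\kappa z}$ to give

$$-\kappa - 2\kappa z'^2 - \kappa z'^4 - z'' - z'' z'^2 + \kappa z'^2 + \kappa z'^4 + z'' z'^2 = 0$$

Cancelling terms gives

$$z'' + \kappa(1 + z'^2) = 0$$

Separating the variables leads to

$$\arctan z' = \int \frac{dz'}{z'^2 + 1} = -\int \kappa \, dx = -\kappa x + c_1$$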


Integration gives

$$z(x) = \int_{-a}^{x} \tan(c_1 - \kappa x) \, dx = \frac{\ln(\cos(c_1 - \kappa x)) - \ln(\cos(c_1 + \kappa a))}{\kappa} + c_2 = \frac{1}{\kappa}\ln\left( \frac{\cos(c_1 - \kappa x)}{\cos(c_1 + \kappa a)} \right) + c_2$$

Using the initial condition that $z(-a) = 0$ gives $c_2 = 0$. Similarly the final condition $z(a) = 0$ implies that $c_1 = 0$. Thus Euler’s equation has determined that the optimal trajectory that minimizes the cost integral $C$ is

$$z(x) = \frac{1}{\kappa}\ln\left( \frac{\cos(\kappa x)}{\cos(\kappa a)} \right)$$

This  example  is  typical  of  problems  encountered  in  economics. 
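
The extremal found above can be checked by evaluating the cost integral numerically for a few trial flight paths. The sketch below is only illustrative: it assumes $\kappa = 1$ and $a = 1$ (so that $\kappa a < \pi/2$ and the logarithm is defined), and it compares the extremal with rescaled copies of itself, where a scale of 0 corresponds to level flight at ground level. The function name `flight_cost` and the choice of trial paths are hypothetical.

```python
import math


def flight_cost(scale, kappa=1.0, a=1.0, n=20000):
    """Cost integral C for the trial path z(x) = scale * z_ext(x), where
    z_ext(x) = (1/kappa) ln(cos(kappa x) / cos(kappa a)) is the extremal.

    scale = 1 is the extremal itself; scale = 0 is level flight at z = 0.
    Midpoint rule on the integrand exp(-kappa z) sqrt(1 + z'^2).
    """
    dx = 2.0 * a / n
    total = 0.0
    for k in range(n):
        x = -a + (k + 0.5) * dx
        z = scale * math.log(math.cos(kappa * x) / math.cos(kappa * a)) / kappa
        zprime = -scale * math.tan(kappa * x)
        total += math.exp(-kappa * z) * math.sqrt(1.0 + zprime ** 2) * dx
    return total


for scale in (0.0, 0.9, 1.0, 1.1):
    print(scale, flight_cost(scale))
```

With these values the unscaled extremal should give the lowest cost of the four trial paths, and level flight should cost $2a$.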


5.4  Selection  of  the  independent  variable 

A wide selection of variables can be chosen as the independent variable for variational calculus. The derivation of Euler’s equation and example 5.1 both assumed that the independent variable is $x$, whereas example 5.2 used $y$ as the independent variable, example 5.3 used $x$ with $z$ as the dependent variable, and Lagrangian mechanics uses time $t$ as the independent variable. Selection of which variable to use as the independent variable does not change the physics of a problem, but some selections can simplify the mathematics for obtaining an analytic solution. The following example of a cylindrically-symmetric soap-bubble surface formed by blowing a soap bubble that stretches between two circular hoops illustrates the importance of selecting the independent variable.

5.4  Example:  Surface  area  of  a cylindrically-symmetric  soap  bubble 

Consider a cylindrically-symmetric soap-bubble surface formed by blowing a soap bubble that stretches between two circular hoops. The surface energy, that results from the surface tension of the soap bubble, is minimized when the surface area of the bubble is minimized. Assume that the axes of the two hoops lie along the $z$ axis as shown in the adjacent figure.

It is intuitively obvious that the soap bubble having the minimum surface area that is bounded by the two hoops will have a circular cross section that is concentric with the symmetry axis, and the radius will be smaller between the two hoops.

Therefore, intuition can be used to simplify the problem to finding the shape of the contour of revolution around the axis of symmetry that defines the shape of the surface of minimum surface area. Use cylindrical coordinates $(\rho, \theta, z)$ and assume that hoop 1 at $z_1$ has radius $\rho_1$ and hoop 2 at $z_2$ has radius $\rho_2$. Consider the cases where either $\rho$ or $z$ is selected to be the independent variable.

The differential arc-length element of the circular annulus at constant $\theta$ between $z$ and $z + dz$ is given by $ds = \sqrt{dz^2 + d\rho^2}$. Therefore the area of the infinitesimal circular annulus is $dS = 2\pi\rho \, ds$ which can be integrated to give the area of the surface $S$ of the soap bubble bounded by the two circular hoops as

$$S = 2\pi\int_{1}^{2} \rho\sqrt{dz^2 + d\rho^2}$$

[Figure: Cylindrically-symmetric surface formed by rotation about the $z$ axis of a soap bubble suspended between two identical hoops centred on the $z$ axis.]


Independent  variable  z 

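Assuming that $z$ is the independent variable, then the surface area can be written as

$$S = 2\pi\int_{z_1}^{z_2} \rho\sqrt{1 + \left(\frac{d\rho}{dz}\right)^2} \, dz = 2\pi\int_{z_1}^{z_2} \rho\sqrt{1 + \rho'^2} \, dz$$

where $\rho' \equiv \frac{d\rho}{dz}$. The function of the surface integral is $f = \rho\sqrt{1 + \rho'^2}$. The derivatives are

$$\frac{\partial f}{\partial \rho} = \sqrt{1 + \rho'^2}$$

and

$$\frac{\partial f}{\partial \rho'} = \frac{\rho\rho'}{\sqrt{1 + \rho'^2}}$$

Therefore Euler’s equation gives

$$\frac{d}{dz}\left( \frac{\rho\rho'}{\sqrt{1 + (\rho')^2}} \right) - \sqrt{1 + \rho'^2} = 0$$

This is not an easy equation to solve.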

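Independent variable $\rho$

Consider the case where the independent variable is chosen to be $\rho$, then the surface integral can be written as

$$S = 2\pi\int_{\rho_1}^{\rho_2} \rho\sqrt{1 + \left(\frac{dz}{d\rho}\right)^2} \, d\rho = 2\pi\int_{\rho_1}^{\rho_2} \rho\sqrt{1 + z'^2} \, d\rho$$

where $z' \equiv \frac{dz}{d\rho}$. Thus the function of the surface integral is $f = \rho\sqrt{1 + z'^2}$. The derivatives are

$$\frac{\partial f}{\partial z} = 0$$

and

$$\frac{\partial f}{\partial z'} = \frac{\rho z'}{\sqrt{1 + z'^2}}$$

Therefore Euler’s equation gives

$$\frac{d}{d\rho}\left( \frac{\rho z'}{\sqrt{1 + z'^2}} \right) = 0$$

That is

$$\frac{\rho z'}{\sqrt{1 + z'^2}} = a$$

where $a$ is a constant. This can be rewritten as

$$z'^2\left(\rho^2 - a^2\right) = a^2$$

or

$$z' = \frac{dz}{d\rho} = \frac{a}{\sqrt{\rho^2 - a^2}}$$

The integral of this is

$$z = a\cosh^{-1}\left(\frac{\rho}{a}\right) + b$$

That is

$$\rho = a\cosh\left(\frac{z - b}{a}\right)$$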


which  is  the  equation  of  a catenary.  The  catenary  is  the  shape  of  a uniform  flexible  cable  hung  in  a uniform 
gravitational  field.  The  constants  a and  b are  given  by  the  end  points.  The  physics  of  the  solution  must  be 
identical  for  either  choice  of  independent  variable.  However,  mathematically  one  case  is  easier  to  solve  than 
the  other  because,  in  the  latter  case,  one  term  in  Euler’s  equation  is  zero. 
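
The catenary solution can be explored numerically for a specific pair of hoops. The sketch below is illustrative and rests on assumptions not in the text: two identical hoops of radius 1 located at $z = \pm 0.5$, so that $b = 0$ by symmetry, and a bisection bracket chosen by hand for these particular values. It solves $a\cosh(z_{hoop}/a) = \rho_{hoop}$ for $a$ and compares the area of the resulting surface with that of a straight cylinder spanning the same hoops. The function names are hypothetical.

```python
import math


def catenary_parameter(radius, half_sep, lo=0.6, hi=1.0, iters=100):
    """Bisection solve of a * cosh(half_sep / a) = radius for the constant a.

    The bracket [lo, hi] is an assumption that suits radius = 1 and
    half_sep = 0.5; other hoop geometries need a different bracket.
    """
    def g(a):
        return a * math.cosh(half_sep / a) - radius

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def surface_area(profile, slope, half_sep, n=20000):
    """S = 2 pi * integral of rho sqrt(1 + rho'^2) dz by the midpoint rule."""
    dz = 2.0 * half_sep / n
    total = 0.0
    for k in range(n):
        z = -half_sep + (k + 0.5) * dz
        total += profile(z) * math.sqrt(1.0 + slope(z) ** 2) * dz
    return 2.0 * math.pi * total


radius, half_sep = 1.0, 0.5
a = catenary_parameter(radius, half_sep)

catenoid_area = surface_area(lambda z: a * math.cosh(z / a),
                             lambda z: math.sinh(z / a), half_sep)
cylinder_area = surface_area(lambda z: radius, lambda z: 0.0, half_sep)
print(a, catenoid_area, cylinder_area)
```

For this geometry the catenary-shaped surface should have a smaller area than the cylinder, in line with the intuition that the radius is smaller between the two hoops.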




5.5 Functions with several independent variables $y_i(x)$

The discussion has focussed on systems having only a single function $y(x)$ such that the functional is an extremum. It is more common to have a functional that is dependent upon several independent variables $f[y_1(x), y_1'(x), y_2(x), y_2'(x), \ldots; x]$ which can be written as

$$F = \int_{x_1}^{x_2} \sum_{i=1}^{N} f[y_i(x), y_i'(x); x] \, dx \tag{5.16}$$

where $i = 1, 2, 3, \ldots, N$.

By analogy with the one dimensional problem, define neighboring functions $\eta_i$ for each variable. Then

$$y_i(\epsilon, x) = y_i(0, x) + \epsilon\eta_i(x) \tag{5.17}$$

$$y_i'(\epsilon, x) \equiv \frac{dy_i(\epsilon, x)}{dx} = \frac{dy_i(0, x)}{dx} + \epsilon\frac{d\eta_i}{dx}$$

where $\eta_i$ are independent functions of $x$ that vanish at $x_1$ and $x_2$. Using equations 5.12 and 5.17 leads to the requirements for an extremum value to be

$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2} \sum_{i}^{N} \left( \frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_i'} \right)\eta_i(x) \, dx = 0 \tag{5.18}$$

If the variables $y_i(x)$ are independent, then the $\eta_i(x)$ are independent. Since the $\eta_i(x)$ are independent, then evaluating the above equation at $\epsilon = 0$ implies that each term in the bracket must vanish independently. That is, Euler’s differential equation becomes a set of $N$ equations for the $N$ independent variables

$$\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_i'} = 0 \tag{5.19}$$

where $i = 1, 2, 3, \ldots, N$. Thus, each of the $N$ equations can be solved independently when the $N$ variables are independent. Note that Euler’s equation involves partial derivatives for the dependent variables $y_i$, $y_i'$ and the total derivative for the independent variable $x$.


5.5  Example:  Fermat’s  Principle 

In 1662 Fermat proposed that the propagation of light obeyed the generalized principle of least transit time. In optics, Fermat’s principle, or the principle of least time, is the principle that the path taken between two points by a ray of light is the path that can be traversed in the least time. Historically, the proof of Fermat’s principle by Johann Bernoulli was one of the first triumphs of the calculus of variations, and served as a guiding principle in the formulation of physical laws using variational calculus.

Consider the geometry shown in the figure, where the light travels from the point $P_1(0, y_1, 0)$ to the point $P_2(x_2, -y_2, 0)$. The light beam intersects a plane glass interface at the point $Q(x, 0, z)$.

The French mathematician Fermat discovered that the required path travelled by light is the path for which the travel time $t$ is a minimum. The transit time from the initial point $P_1$ to the final point $P_2$ is given by

$$t = \int_{1}^{2} dt = \int_{1}^{2} \frac{ds}{v} = \frac{1}{c}\int_{1}^{2} n \, ds = \frac{1}{c}\int_{y_1}^{-y_2} n(x, y, z)\sqrt{1 + (x')^2 + (z')^2} \, dy$$

assuming that the velocity of light in any medium is given by $v = c/n$ where $n$ is the refractive index of the medium and $c$ is the velocity of light in vacuum.


[Figure: Light incident upon a plane glass interface in the $(x, z)$ plane at $y = 0$.]
This is a problem that has two dependent variables $x(y)$ and $z(y)$ with $y$ chosen as the independent variable. The integral can be broken into two parts $y_1 \to 0$ and $0 \to -y_2$.

$$t = \frac{1}{c}\left[ \int_{y_1}^{0} n_1\sqrt{1 + (x')^2 + (z')^2} \, dy + \int_{0}^{-y_2} n_2\sqrt{1 + (x')^2 + (z')^2} \, dy \right]$$

The functionals are functions of $x'$ and $z'$ but not $x$ or $z$. Thus Euler’s equation for $z$ simplifies to

$$0 + \frac{d}{dy}\left( \frac{1}{c}\left( \frac{n_1 z'}{\sqrt{1 + x'^2 + z'^2}} + \frac{n_2 z'}{\sqrt{1 + x'^2 + z'^2}} \right) \right) = 0$$

This implies that $z' = 0$, therefore $z$ is a constant. Since the initial and final values were chosen to be $z_1 = z_2 = 0$, therefore at the interface $z = 0$. Similarly Euler’s equations for $x$ are

$$0 + \frac{d}{dy}\left( \frac{1}{c}\left( \frac{n_1 x'}{\sqrt{1 + x'^2 + z'^2}} + \frac{n_2 x'}{\sqrt{1 + x'^2 + z'^2}} \right) \right) = 0$$

But $x' = \tan\theta_1$ for $n_1$ and $x' = -\tan\theta_2$ for $n_2$ and it was shown that $z' = 0$. Thus

$$0 + \frac{d}{dy}\left( \frac{1}{c}\left( \frac{n_1\tan\theta_1}{\sqrt{1 + (\tan\theta_1)^2}} - \frac{n_2\tan\theta_2}{\sqrt{1 + (\tan\theta_2)^2}} \right) \right) = \frac{d}{dy}\left( \frac{1}{c}\left( n_1\sin\theta_1 - n_2\sin\theta_2 \right) \right) = 0$$

Therefore $\frac{1}{c}(n_1\sin\theta_1 - n_2\sin\theta_2) = $ constant which must be zero since when $n_1 = n_2$, then $\theta_1 = \theta_2$. Thus Fermat’s principle leads to Snell’s Law.

$$n_1\sin\theta_1 = n_2\sin\theta_2$$

The geometry of this problem is simple enough to directly minimize the path rather than using Euler’s equations for the two parameters as performed above. The lengths of the paths $P_1Q$ and $QP_2$ are

$$P_1Q = \sqrt{x^2 + y_1^2 + z^2}$$

$$QP_2 = \sqrt{(x_2 - x)^2 + y_2^2 + z^2}$$

The total transit time is given by

$$t = \frac{1}{c}\left( n_1\sqrt{x^2 + y_1^2 + z^2} + n_2\sqrt{(x_2 - x)^2 + y_2^2 + z^2} \right)$$

This problem involves two variables, $x$ and $z$. To find the minima, set the partial derivatives $\frac{\partial t}{\partial z} = 0$ and $\frac{\partial t}{\partial x} = 0$. That is,

$$\frac{\partial t}{\partial z} = \frac{1}{c}\left( \frac{n_1 z}{\sqrt{x^2 + y_1^2 + z^2}} + \frac{n_2 z}{\sqrt{(x_2 - x)^2 + y_2^2 + z^2}} \right) = 0$$

This is zero only if $z = 0$, that is the point $Q$ lies in the plane containing $P_1$ and $P_2$. Similarly

$$\frac{\partial t}{\partial x} = \frac{1}{c}\left( \frac{n_1 x}{\sqrt{x^2 + y_1^2 + z^2}} - \frac{n_2(x_2 - x)}{\sqrt{(x_2 - x)^2 + y_2^2 + z^2}} \right) = \frac{1}{c}\left( n_1\sin\theta_1 - n_2\sin\theta_2 \right) = 0$$

This is zero only if Snell’s law applies, that is

$$n_1\sin\theta_1 = n_2\sin\theta_2$$

Fermat’s principle has shown that the refracted light is given by Snell’s Law, and is in a plane normal to the surface. The laws of reflection also are given since then $n_1 = n_2 = n$ and the angle of reflection equals the angle of incidence.
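
The direct minimization described above can also be carried out numerically. The following sketch is illustrative only: it sets $z = 0$, as shown above, assumes the hypothetical values $n_1 = 1.0$, $n_2 = 1.5$, $y_1 = y_2 = 1$ and $x_2 = 2$ with $c = 1$, and uses a simple ternary search to locate the crossing point $x$ that minimizes the transit time. The function names are hypothetical.

```python
import math


def transit_time(x, n1, n2, y1, y2, x2, c=1.0):
    """Transit time from P1(0, y1) to Q(x, 0) to P2(x2, -y2) with z = 0."""
    return (n1 * math.hypot(x, y1) + n2 * math.hypot(x2 - x, y2)) / c


def fastest_crossing(n1, n2, y1, y2, x2, iters=200):
    """Ternary search for the crossing point x in [0, x2] that minimizes
    the transit time (the transit time is a convex function of x)."""
    lo, hi = 0.0, x2
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if transit_time(m1, n1, n2, y1, y2, x2) < transit_time(m2, n1, n2, y1, y2, x2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


n1, n2 = 1.0, 1.5                # illustrative refractive indices
y1, y2, x2 = 1.0, 1.0, 2.0       # illustrative geometry

x_min = fastest_crossing(n1, n2, y1, y2, x2)
sin_theta1 = x_min / math.hypot(x_min, y1)
sin_theta2 = (x2 - x_min) / math.hypot(x2 - x_min, y2)
print(n1 * sin_theta1, n2 * sin_theta2)
```

The two printed quantities should agree to within the accuracy of the search, which is the numerical counterpart of Snell’s Law.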


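5.6 Example: Minimum of $(\nabla\phi)^2$ in a volume

Find the function $\phi(x_1, x_2, x_3)$ that has the minimum value of $(\nabla\phi)^2$ per unit volume. For the volume $V$ it is desired to minimize the following

$$J = \frac{1}{V}\iiint (\nabla\phi)^2 \, dx_1 dx_2 dx_3 = \frac{1}{V}\iiint \left[ \left(\frac{\partial\phi}{\partial x_1}\right)^2 + \left(\frac{\partial\phi}{\partial x_2}\right)^2 + \left(\frac{\partial\phi}{\partial x_3}\right)^2 \right] dx_1 dx_2 dx_3$$

Note that the variables $x_1, x_2, x_3$ are independent, and thus Euler’s equation for several independent variables can be used. To minimize the functional $J$, the function

$$f = \left(\frac{\partial\phi}{\partial x_1}\right)^2 + \left(\frac{\partial\phi}{\partial x_2}\right)^2 + \left(\frac{\partial\phi}{\partial x_3}\right)^2 \tag{a}$$

must satisfy the Euler equation

$$\frac{\partial f}{\partial\phi} - \sum_{i=1}^{3} \frac{\partial}{\partial x_i}\left( \frac{\partial f}{\partial\phi_i'} \right) = 0$$

where $\phi_i' \equiv \frac{\partial\phi}{\partial x_i}$. Substituting $f$ into Euler’s equation gives

$$\sum_{i=1}^{3} \frac{\partial}{\partial x_i}\left( \frac{\partial\phi}{\partial x_i} \right) = 0$$

This is just Laplace’s equation

$$\nabla^2\phi = 0$$

Therefore $\phi$ must satisfy Laplace’s equation in order that the functional $J$ be a minimum.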


5.6  Euler’s  integral  equation 


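An integral form of the Euler differential equation can be written which is useful for cases when the function $f$ does not depend explicitly on the independent variable $x$, that is, when $\frac{\partial f}{\partial x} = 0$. Note that

$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} + \frac{\partial f}{\partial y'}\frac{dy'}{dx} \tag{5.20}$$

But

$$\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'}\right) = \frac{\partial f}{\partial y'}\frac{dy'}{dx} + y'\frac{d}{dx}\frac{\partial f}{\partial y'} \tag{5.21}$$

Combining these two equations gives

$$\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'}\right) = \frac{df}{dx} - \frac{\partial f}{\partial x} - y'\frac{\partial f}{\partial y} + y'\frac{d}{dx}\frac{\partial f}{\partial y'} \tag{5.22}$$

The last two terms can be rewritten as

$$y'\left(\frac{d}{dx}\frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y}\right) \tag{5.23}$$

which vanishes when the Euler equation is satisfied. Therefore the above equation simplifies to

$$\frac{\partial f}{\partial x} - \frac{d}{dx}\left(f - y'\frac{\partial f}{\partial y'}\right) = 0 \tag{5.24}$$

This integral form of Euler’s equation is especially useful when $\frac{\partial f}{\partial x} = 0$, that is, when $f$ does not depend explicitly on the independent variable $x$. Then the first integral of equation 5.24 is a constant, i.e.

$$f - y'\frac{\partial f}{\partial y'} = \text{constant} \tag{5.25}$$

This is Euler’s integral variational equation. Note that the shortest distance between two points, the minimum surface of rotation, and the brachistochrone, described earlier, all are examples where $\frac{\partial f}{\partial x} = 0$ and thus the integral form of Euler’s equation is useful for solving these cases.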
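
As a small numerical illustration of equation 5.25, the sketch below evaluates $f - y'\,\partial f/\partial y'$ for the integrand $f = y\sqrt{1 + y'^2}$, which has no explicit $x$ dependence. It compares a curve of the form $y = c\cosh((x-b)/c)$, which solves the corresponding Euler equation, with an arbitrary parabola that does not. The constants and the function names are assumptions chosen only for this illustration.

```python
import math

def first_integral(y, yp):
    """Evaluate f - y' * df/dy' for f = y * sqrt(1 + y'^2)."""
    root = math.sqrt(1.0 + yp * yp)
    f = y * root
    df_dyp = y * yp / root
    return f - yp * df_dyp

c, b = 1.5, 0.3   # illustrative constants (assumed values)

def catenary(x):
    """Return (y, y') for y = c*cosh((x - b)/c), a solution of the Euler equation."""
    return c * math.cosh((x - b) / c), math.sinh((x - b) / c)

def parabola(x):
    """Return (y, y') for y = 1 + x^2, a curve that is not a solution."""
    return 1.0 + x * x, 2.0 * x

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
on_solution = [first_integral(*catenary(x)) for x in xs]
off_solution = [first_integral(*parabola(x)) for x in xs]
print(on_solution)    # the same value, c, at every x
print(off_solution)   # varies with x
```

Along the solution curve the quantity stays fixed at the constant $c$, while along the parabola it changes from point to point, which is what makes the first integral a useful test of a candidate solution.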
5.7 Constrained variational systems

Imposing a constraint on a variational system implies:

1. The $N$ constrained coordinates $y_i(x)$ are correlated, which violates the assumption made in chapter 5.5 that the $N$ variables are independent.

2. Constrained motion implies that constraint forces must be acting to account for the correlation of the variables. These constraint forces must be taken into account in the equations of motion.

For example, for a disk rolling down an inclined plane without slipping, there are three coordinates: $x$ (perpendicular to the wedge), $y$ (along the surface of the wedge), and the rotation angle $\theta$, shown in figure 5.2. The constraint forces, $F_f$ and $N$, lead to the correlation of the variables such that $x = R$, while $y = R\theta$. Basically there is only one independent variable, which can be either $y$ or $\theta$. The use of only one independent variable essentially sweeps the constraint forces under the rug, which is fine if you only need to know the equation of motion. If you need to determine the forces of constraint then it is necessary to include all coordinates explicitly in the equations of motion.

5.7.1  Holonomic  constraints 

Most problems involve restrictions or constraints that couple the coordinates. For example, the $y_i(x)$ may be confined to a surface in coordinate space. The constraints mean that the coordinates $y_i(x)$ are not independent, but are related by equations of constraint. A constraint is called holonomic if the equations of constraint can be expressed in the form of an algebraic equation that directly and unambiguously specifies the shape of the surface of constraint. A non-holonomic constraint does not provide an algebraic relation between the correlated coordinates. In addition to the holonomy of the constraints, the equations of constraint also can be grouped into the following three classifications depending on whether they are algebraic, differential, or integral. These different equations of constraint exhibit different holonomy in the relation between the coupled coordinates. Fortunately the solution of constrained systems is greatly simplified if the equations of constraint are holonomic.

5.7.2  Geometric  (algebraic)  equations  of  constraint 

Geometric constraints can be expressed in the form of algebraic relations that directly specify the shape of the surface of constraint in coordinate space $q_1, q_2, \ldots, q_j, \ldots, q_n$:

$$g_k(q_1, q_2, \ldots, q_j, \ldots, q_n; t) = 0 \tag{5.26}$$

where $j = 1, 2, 3, \ldots, n$. There can be $m$ such equations of constraint, where $1 \leq k \leq m$. An example of such a geometric constraint is when the motion is confined to the surface of a sphere of radius $R$ in coordinate space, which can be written in the form $g = x^2 + y^2 + z^2 - R^2 = 0$. Such algebraic constraint equations are called holonomic, which allows use of generalized coordinates as well as Lagrange multipliers to handle both the constraint forces and the correlation of the coordinates.


Figure 5.2: A disk rolling down an inclined plane.


5.7.3  Kinematic  (differential)  equations  of  constraint 

The $m$ constraint equations also can be expressed in terms of infinitesimal displacements in the form

$$\sum_{j=1}^{n}\frac{\partial g_k}{\partial q_j}dq_j + \frac{\partial g_k}{\partial t}dt = 0 \tag{5.27}$$

where $k = 1, 2, 3, \ldots, m$ and $j = 1, 2, 3, \ldots, n$. If equation 5.27 represents the total differential of a function then it can be integrated to give a holonomic relation of the form of equation 5.26. However, if equation 5.27 is not the total differential, then it is non-holonomic and can be integrated only after having solved the full problem.

An example of differential constraint equations is a wheel rolling on a plane without slipping, which is non-holonomic and more complicated than might be expected. The wheel moving on a plane has five degrees of freedom since the height $z$ is fixed. That is, the motion of the center of mass requires two coordinates $(x, y)$ plus there are three angles $(\phi, \theta, \psi)$ where $\phi$ is the rotation angle for the wheel, $\theta$ is the pivot angle of the axis, and $\psi$ is the tilt angle of the wheel. If the wheel slides then all five degrees of freedom are active. If the axis of rotation of the wheel is horizontal, that is, the tilt angle $\psi = 0$ is constant, then this kinematic system leads to three differential constraint equations. The wheel can roll with angular velocity $\dot{\phi}$, as well as pivot, which corresponds to a change in $\theta$. Combining these leads to two differential equations of constraint

$$dx - a\sin\theta \, d\phi = 0 \qquad dy + a\cos\theta \, d\phi = 0 \tag{5.28}$$

These constraints are insufficient to provide finite relations between all the coordinates. That is, the constraints cannot be reduced by integration to the form of equation 5.26 because there is no functional relation between $\phi$ and the other three variables, $x, y, \theta$. Many rolling trajectories are possible between any two points of contact on the plane that are related to different pivot angles. That is, the point of contact of the disk could pivot plus roll in a circle, returning to the same point where $x, y, \theta$ are unchanged, whereas the value of $\phi$ depends on the circumference of the circle. As a consequence the rolling constraint is non-holonomic except for the case where the disk rolls in a straight line and remains vertical.
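
The closed-loop argument can be made concrete by integrating equations 5.28 numerically. The sketch below is an illustration under assumptions: $a$ is taken to be the wheel radius, the wheel pivots at a steady rate so that its contact point traces a circle of radius `rho`, and the function name and numerical values are hypothetical. It rolls the wheel once around two circles of different size and compares the final coordinates.

```python
import math

def roll_around_circle(a, rho, steps=20000):
    """Integrate dx = a sin(theta) dphi and dy = -a cos(theta) dphi while the wheel
    pivots steadily so that its contact point traces a circle of radius rho."""
    phi_total = 2.0 * math.pi * rho / a      # rolling angle needed for one lap
    dphi = phi_total / steps
    x = y = 0.0
    for n in range(steps):
        phi_mid = (n + 0.5) * dphi           # midpoint rule
        theta = a * phi_mid / rho            # heading turns by a*dphi/rho per step
        x += a * math.sin(theta) * dphi
        y -= a * math.cos(theta) * dphi
    theta_end = a * phi_total / rho          # one full turn: same orientation as at start
    return x, y, theta_end, phi_total

small = roll_around_circle(a=0.5, rho=1.0)
large = roll_around_circle(a=0.5, rho=2.0)
print(small)
print(large)
```

Both laps return the wheel to the same contact point and the same orientation (the pivot angle has advanced by one full turn), yet the rolling angle differs by a factor of two. No function $\phi(x, y, \theta)$ can therefore exist, which is the content of the statement that the rolling constraint is non-holonomic.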

5.7.4  Isoperimetric  (integral)  equations  of  constraint 

Equations  of  constraint  also  can  be  expressed  in  terms  of  direct  integrals.  This  situation  is  encountered  for 
isoperimetric  problems,  such  as  finding  the  maximum  volume  bounded  by  a surface  of  fixed  area,  or  the 
shape  of  a hanging  rope  of  fixed  length.  Integral  constraints  occur  in  economics  when  minimizing  some  cost 
algorithm  subject  to  a fixed  total  cost  constraint. 

A simple example of an isoperimetric problem involves finding the curve $y = y(x)$ such that the functional has an extremum where the curve $y(x)$ satisfies boundary conditions such that $y(x_1) = a$ and $y(x_2) = b$, that is

$$F(y) = \int_{x_1}^{x_2} f(y, y'; x) \, dx \tag{5.29}$$

is an extremum such that the perimeter also is constrained to satisfy

$$G(y) = \int_{x_1}^{x_2} g(y, y'; x) \, dx = l \tag{5.30}$$

where $l$ is a fixed length. This integral constraint is geometric and holonomic. Another example is finding the minimum surface area of a closed surface subject to the enclosed volume being the constraint.

5.7.5  Properties  of  the  constraint  equations 

Holonomic constraints

Geometric constraints can be expressed in the form of an algebraic equation that directly specifies the shape of the surface of constraint

$$g(y_1, y_2, y_3, x) = 0 \tag{5.31}$$

Such a system is called holonomic since there is a direct relation between the coupled variables. An example of such a holonomic geometric constraint is if the motion is confined to the surface of a sphere of radius $R$ which can be written in the form

$$g = x^2 + y^2 + z^2 - R^2 = 0 \tag{5.32}$$

Non-holonomic constraints

There are many classifications of non-holonomic constraints that exist if equation 5.31 is not satisfied. The algebraic approach is difficult to handle when the constraint is an inequality, such as the requirement that the location is restricted to lie inside a spherical shell of radius $R$ which can be expressed as

$$g = x^2 + y^2 + z^2 - R^2 \leq 0 \tag{5.33}$$

This non-holonomic constrained system has a one-sided constraint. Systems usually are non-holonomic if the constraint is kinematic as discussed above.


Partial holonomic constraints

Partial-holonomic constraints are holonomic for a restricted range of the constraint surface in coordinate space, and this range can be case specific. This can occur if the constraint force is one-sided and perpendicular to the path. An example is the pendulum with the mass attached to the fulcrum by a flexible string that provides tension but not compression. Then the pendulum length is constant only if the tension in the string is positive. Thus the pendulum will be holonomic if the gravitational plus centrifugal forces are such that the tension in the string is positive, but the system becomes non-holonomic if the tension would have to be negative, as can happen when the pendulum rotates to an upright angle where the centrifugal force outwards is insufficient to compensate for the vertical downward component of the gravitational force. There are many other examples where the motion of an object is holonomic when the object is pressed against the constraint surface, such as the surface of the Earth, but is unconstrained if the object leaves the surface.


Time  dependence 

A constraint  is  called  scleronomic  if  the  constraint  is  not  explicitly  time  dependent.  This  ignores  the  time 
dependence  contained  within  the  solution  of  the  equations  of  motion.  Fortunately  a major  fraction  of 
systems  are  scleronomic.  The  constraint  is  called  rheonomic  if  the  constraint  is  explicitly  time  dependent. 
An  example  of  a rheonomic  system  is  where  the  size  or  shape  of  the  surface  of  constraint  is  explicitly  time 
dependent  such  as  a deflating  pneumatic  tire. 


Energy  conservation 

The  solution  depends  on  whether  the  constraint  is  conservative  or  dissipative,  that  is,  if  friction  or  drag  are 
acting.  The  system  will  be  conservative  if  there  are  no  drag  forces,  and  the  constraint  forces  are  perpendicular 
to  the  trajectory  of  the  path  such  as  the  motion  of  a charged  particle  in  a magnetic  field.  Forces  of  constraint 
can  result  from  sliding  of  two  solid  surfaces,  rolling  of  solid  objects,  fluid  flow  in  a liquid  or  gas,  or  result  from 
electromagnetic  forces.  Energy  dissipation  can  result  from  friction,  drag  in  a fluid  or  gas,  or  finite  resistance 
of  electric  conductors  leading  to  dissipation  of  induced  electric  currents  in  a conductor,  e.g.  eddy  currents. 

A rolling  constraint  is  unusual  in  that  friction  between  the  rolling  bodies  is  necessary  to  maintain  rolling. 
A disk on a frictionless inclined plane will conserve its angular momentum since there is no torque acting
if  the  rolling  contact  is  frictionless,  that  is,  the  disk  will  just  slide.  If  the  friction  is  sufficient  to  stop  sliding, 
then  the  bodies  will  roll  and  not  slide.  A perfect  rolling  body  does  not  dissipate  energy  since  no  work  is 
done  at  the  instantaneous  point  of  contact  where  both  bodies  are  in  zero  relative  motion  and  the  force  is 
perpendicular  to  the  motion.  In  real  life,  a rolling  wheel  can  involve  a very  small  energy  dissipation  due  to 
deformation  at  the  point  of  contact  coupled  with  non-elastic  properties  of  the  material  used  to  make  the 
wheel  and  the  plane  surface.  For  example,  a pneumatic  tire  can  heat  up  and  expand  due  to  flexing  of  the 
tire. 


5.7.6  Treatment  of  constraint  forces  in  variational  calculus 

There are three major approaches to handle constraint forces in variational calculus. All three of them exploit the tremendous freedom and flexibility available when using generalized coordinates.

(1) The generalized coordinate approach, described in chapter 5.8, exploits the correlation of the $n$ coordinates due to the $m$ constraint forces to reduce the dimension of the equations of motion to $s = n - m$ degrees of freedom. This approach embeds the $m$ constraint forces into the choice of generalized coordinates and does not determine the constraint forces.

(2) The Lagrange multiplier approach, described in chapter 5.9, exploits generalized coordinates but includes the $m$ constraint forces in the Euler equations, so that the constraint forces are determined in addition to the $n$ equations of motion.

(3) The generalized forces approach, described in chapter 6.7.3, introduces constraint and other forces explicitly.
5.8 Generalized coordinates in variational calculus


Newtonian mechanics is based on a vectorial treatment of mechanics which can be difficult to apply when solving complicated problems in mechanics. Constraint forces acting on a system usually are unknown, and thus must be included explicitly in Newtonian mechanics so that they can be determined simultaneously with the solution of the dynamical equations of motion. The major advantage of the variational approaches is that solution of the dynamical equations of motion can be simplified by expressing the motion in terms of $n$ independent generalized coordinates. These generalized coordinates can be any set of independent variables, $q_i$, where $1 \leq i \leq n$, plus the corresponding velocities $\dot{q}_i$ for Lagrangian mechanics, or the corresponding canonical variables, $q_i, p_i$ for Hamiltonian mechanics. The generalized coordinates for the $n$ variables are used to specify the scalar functional, and the variational approach employs the scalar functional to determine the trajectory.

The generalized coordinates used for the variational approach do not need to be orthogonal; they only need to be independent, since they are used only to completely specify the magnitude of the scalar functional. This greatly expands the arsenal of possible generalized coordinates beyond what is available using Newtonian mechanics. For example, generalized coordinates can be the dimensionless amplitudes for the $n$ normal modes of coupled oscillator systems, or action-angle variables. In addition, generalized coordinates having different dimensions can be used for each of the $n$ variables. Each generalized coordinate $q_i$ specifies an independent mode of the system, not a specific particle. For example, each normal mode of coupled oscillators can involve correlated motion of several coupled particles.

The major advantage of using generalized coordinates is that they can be chosen to be perpendicular to a corresponding constraint force, and therefore that specific constraint force does no work for motion along that generalized coordinate. Moreover, the constrained motion does no work in the direction of the constraint force for rigid constraints. Thus generalized coordinates allow specific constraint forces to be ignored in evaluation of the minimized functional. This freedom and flexibility of choice of generalized coordinates allows the correlated motion produced by the constraint forces to be embedded directly into the choice of the independent generalized coordinates, and the actual constraint forces can be ignored. Embedding the constraint-induced correlations into the generalized coordinates effectively "sweeps the constraint forces under the rug", which greatly simplifies the equations of motion for any system that involves constraint forces. Selection of the appropriate generalized coordinates can be obvious, and often it is performed subconsciously by the user.

Three  variational  approaches  are  used  that  employ  generalized  coordinates  to  derive  the  equations  of 
motion  of  a system  that  has  n generalized  coordinates  subject  to  m constraints. 

1) Minimal set of generalized coordinates: When the $m$ equations of constraint are holonomic, then the $m$ algebraic constraint relations can be used to transform the coordinates into $s = n - m$ independent generalized coordinates $q_i$. This approach reduces the number of unknowns, $n$, by the number of constraints, $m$, to give a minimal set of $s = n - m$ independent generalized dynamical variables. The forces of constraint are not explicitly discussed, or determined, when this generalized coordinate approach is employed. This approach greatly simplifies solution of dynamical problems by avoiding the need for explicit treatment of the constraint forces. This approach is straightforward for holonomic constraints, since the $n$ spatial coordinates $y_1(x), \ldots, y_n(x)$ are coupled by $m$ algebraic equations which can be used to make the transformation to generalized coordinates. Thus the $n$ coupled spatial coordinates are transformed to $s = n - m$ independent generalized dynamical coordinates $q_1(x), \ldots, q_s(x)$, and their generalized first derivatives $q_1'(x), \ldots, q_s'(x)$. These generalized coordinates are independent, and thus it is possible to use Euler’s equation for each independent parameter $q_i$

$$\frac{\partial f}{\partial q_i} - \frac{d}{dx}\frac{\partial f}{\partial q_i'} = 0 \tag{5.34}$$

where $i = 1, 2, 3, \ldots, s$. There are $s = n - m$ such Euler equations. The freedom to choose generalized coordinates underlies the tremendous advantage of applying the variational approach.

2) Lagrange multipliers: The $n$ Lagrange equations, plus the $m$ equations of constraint, can be used to explicitly determine the $n$ generalized coordinates plus the $m$ constraint forces. That is, $n + m$ unknowns are determined. This approach is discussed in chapter 5.9.

3) Generalized forces: This approach introduces the constraint forces explicitly. This approach, applied to Lagrangian mechanics, is discussed in chapter 6.6.3.

The  above  three  approaches  exploit  generalized  coordinates  to  handle  constraint  forces  as  described  in 
chapter  6. 
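
A minimal sketch of the first approach, under assumed values: for motion confined to the surface of a sphere, the $n = 3$ Cartesian coordinates subject to $m = 1$ holonomic constraint can be replaced by $s = 2$ independent angles. The radius, the sample points, and the function names below are hypothetical choices made for illustration.

```python
import math

R = 2.0  # sphere radius (illustrative value)

def constraint(x, y, z):
    """Holonomic constraint g = x^2 + y^2 + z^2 - R^2."""
    return x * x + y * y + z * z - R * R

def embed(theta, phi):
    """Map the s = 2 generalized coordinates to the n = 3 Cartesian coordinates."""
    return (R * math.sin(theta) * math.cos(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(theta))

# Any choice of the two angles satisfies the constraint automatically.
samples = [(0.3, 0.1), (1.2, 2.5), (2.8, 5.9)]
residuals = [constraint(*embed(theta, phi)) for theta, phi in samples]
print(residuals)
```

Because the constraint is built into the coordinate transformation, it never has to be imposed separately, and the constraint force never appears in the resulting equations.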
5.9 Lagrange multipliers for holonomic constraints

5.9.1  Algebraic  equations  of  constraint 

The Lagrange multiplier technique provides a powerful, and elegant, way to handle holonomic constraints using Euler’s equations (see footnote 1). The general method of Lagrange multipliers for $n$ variables, with $m$ constraints, is best introduced using Bernoulli’s ingenious exploitation of virtual infinitesimal displacements, which Lagrange signified by the symbol $\delta$. The term "virtual" refers to an intentional variation of the generalized coordinates $\delta q_i$ in order to elucidate the local sensitivity of a function $F(q_i, x)$ to variation of the variable. Contrary to the usual infinitesimal interval in differential calculus, where an actual displacement $dq_i$ occurs during a time $dt$, a virtual displacement is imagined to be an instantaneous, infinitesimal displacement of a coordinate, not an actual displacement, in order to elucidate the local dependence of $F$ on the coordinate. The local dependence of any functional $F$ on virtual displacements of all $n$ coordinates is given by taking the partial differentials of $F$.

$$\delta F = \sum_{i}^{n}\frac{\partial F}{\partial q_i}\delta q_i \tag{5.35}$$


The function $F$ is stationary, that is an extremum, if equation 5.35 equals zero. The extremum of the functional $F$, given by equation 5.16, can be expressed in a compact form using the virtual displacement formalism as

$$\delta F = \delta\int_{x_1}^{x_2}\sum_{i}^{n} f\left[q_i(x), q_i'(x); x\right] dx = \sum_{i}^{n}\frac{\partial F}{\partial q_i}\delta q_i = 0 \tag{5.36}$$


The auxiliary conditions, due to the $m$ holonomic algebraic constraints for the $n$ variables $q_i$, can be expressed by the $m$ equations

$$g_k(q) = 0 \tag{5.37}$$

where $1 \leq k \leq m$ and $1 \leq i \leq n$ with $m < n$. The variational problem for the $m$ holonomic constraint equations also can be written in terms of $m$ differential equations where $1 \leq k \leq m$

$$\delta g_k = \sum_{i=1}^{n}\frac{\partial g_k}{\partial q_i}\delta q_i = 0 \tag{5.38}$$


Since equations 5.36 and 5.38 both equal zero, the $m$ equations 5.38 can be multiplied by arbitrary undetermined factors $\lambda_k$, and added to equation 5.36 to give

$$\delta F(q_i, x) + \lambda_1\delta g_1 + \lambda_2\delta g_2 + \cdots + \lambda_k\delta g_k + \cdots + \lambda_m\delta g_m = 0 \tag{5.39}$$

Note that this is not trivial in that, although the sum of the constraint equations for each $q_i$ is zero, the individual terms of the sum are not zero.

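Inserting equations 5.36 and 5.38 into 5.39, and collecting all $n$ terms, gives

$$\sum_{i}^{n}\left[\frac{\partial F}{\partial q_i} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_i}\right]\delta q_i = 0 \tag{5.40}$$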


Note that all the $\delta q_i$ are free independent variations and thus the terms in the brackets, which are the coefficients of each $\delta q_i$, individually must equal zero. For each of the $n$ values of $i$, the corresponding bracket implies

$$\frac{\partial F}{\partial q_i} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_i} = 0 \tag{5.41}$$


This is equivalent to what would be obtained from the variational principle

$$\delta F + \sum_{k=1}^{m}\lambda_k\delta g_k = 0 \tag{5.42}$$

1 This textbook uses the symbol $q_i$ to designate a generalized coordinate, and $q_i'$ to designate the corresponding first derivative with respect to the independent variable, in order to differentiate the spatial coordinates from the more powerful generalized coordinates.
Equation 5.42 is equivalent to a variational problem for finding the stationary value of $F'$

$$\delta(F') = \delta\left(F + \sum_{k}^{m}\lambda_k g_k\right) = 0 \tag{5.43}$$

where $F'$ is defined to be

$$F' \equiv F + \sum_{k}^{m}\lambda_k g_k \tag{5.44}$$

The solution to equation 5.43 can be found using Euler’s differential equation 5.19 of variational calculus. At the extremum, $\delta(F') = 0$ corresponds to following contours of constant $F'$ which are in the surface that is perpendicular to the gradients of the terms in $F'$. The Lagrange multiplier constants are required because, although these gradients are parallel at the extremum, the magnitudes of the gradients are not equal.

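The beauty of the Lagrange multipliers approach is that the auxiliary conditions do not have to be handled explicitly, since they are handled automatically as $m$ additional free variables during solution of Euler’s equations for a variational problem with $n + m$ unknowns fit to $n + m$ equations. That is, the $n$ variables $q_i$ are determined by the variational procedure using the $n$ variational equations

$$\frac{d}{dx}\left(\frac{\partial F'}{\partial q_i'}\right) - \frac{\partial F'}{\partial q_i} = \frac{d}{dx}\left(\frac{\partial F}{\partial q_i'}\right) - \left(\frac{\partial F}{\partial q_i} + \sum_{k}^{m}\lambda_k\frac{\partial g_k}{\partial q_i}\right) = 0 \tag{5.45}$$

simultaneously with the $m$ variables $\lambda_k$ which are determined by the $m$ variational equations

$$\frac{d}{dx}\left(\frac{\partial F'}{\partial\lambda_k'}\right) - \frac{\partial F'}{\partial\lambda_k} = 0 \tag{5.46}$$

Equation 5.45 usually is expressed as

$$\frac{\partial F}{\partial q_i} - \frac{d}{dx}\left(\frac{\partial F}{\partial q_i'}\right) + \sum_{k}^{m}\lambda_k\frac{\partial g_k}{\partial q_i} = 0 \tag{5.47}$$

The elegance of Lagrange multipliers is that a single variational approach allows simultaneous determination of all $n + m$ unknowns. Chapter 6.2 will show that the forces of constraint are given directly by the $\lambda_k\frac{\partial g_k}{\partial q_i}$ terms.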
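
The algebraic condition of equation 5.41 can be tried on a small case. The sketch below is an illustration only, with a function and constraint chosen for convenience rather than taken from the text: it extremizes $F = q_1 + q_2$ subject to $g = q_1^2 + q_2^2 - 1 = 0$, using the multiplier value obtained by solving the two stationarity equations and the constraint by hand, and then compares the result with a brute-force scan of the constraint circle.

```python
import math

def F(q1, q2):
    """Function to be extremized (illustrative choice)."""
    return q1 + q2

def grad_F(q1, q2):
    return (1.0, 1.0)

def g(q1, q2):
    """Single holonomic constraint: the point lies on the unit circle."""
    return q1 * q1 + q2 * q2 - 1.0

def grad_g(q1, q2):
    return (2.0 * q1, 2.0 * q2)

# Solving dF/dq_i + lam * dg/dq_i = 0 together with g = 0 by hand gives
# q1 = q2 = -1/(2*lam), with lam = -1/sqrt(2) at the maximum of F.
lam = -1.0 / math.sqrt(2.0)
q = (-1.0 / (2.0 * lam), -1.0 / (2.0 * lam))

# Residuals of the n = 2 stationarity conditions at the candidate point.
residuals = [dF + lam * dg for dF, dg in zip(grad_F(*q), grad_g(*q))]

# Brute-force check: scan the constraint circle and locate the largest F.
best_angle = max((2.0 * math.pi * k / 3600 for k in range(3600)),
                 key=lambda t: F(math.cos(t), math.sin(t)))
best = (math.cos(best_angle), math.sin(best_angle))
print(q, residuals, best)
```

Here $n + m = 3$ unknowns ($q_1$, $q_2$, $\lambda$) are fixed by three equations, and at the solution the gradients of $F$ and $g$ are parallel with $\lambda$ supplying the ratio of their magnitudes.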


5.7  Example:  Two  dependent  variables  coupled  by  one  holonomic  constraint 


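The powerful, and generally applicable, Lagrange multiplier technique is illustrated by considering the case of only two dependent variables, $y(x)$ and $z(x)$, with the function $f(y(x), y'(x), z(x), z'(x); x)$ and with one holonomic equation of constraint coupling these two dependent variables. The extremum is given by requiring

$$\frac{\partial F}{\partial\epsilon} = \int_{x_1}^{x_2}\left[\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right)\frac{\partial y}{\partial\epsilon} + \left(\frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'}\right)\frac{\partial z}{\partial\epsilon}\right]dx = 0 \tag{A}$$

with the constraint expressed by the auxiliary condition

$$g(y, z; x) = 0 \tag{B}$$

Note that the variations $\frac{\partial y}{\partial\epsilon}$ and $\frac{\partial z}{\partial\epsilon}$ are no longer independent because of the constraint equation, thus the two terms in the brackets of equation A are not separately equal to zero at the extremum. However, differentiating the constraint equation B gives

$$\frac{dg}{d\epsilon} = \left(\frac{\partial g}{\partial y}\frac{\partial y}{\partial\epsilon} + \frac{\partial g}{\partial z}\frac{\partial z}{\partial\epsilon}\right) = 0 \tag{C}$$

No $\frac{\partial g}{\partial x}$ term applies because, for the independent variable, $\frac{\partial x}{\partial\epsilon} = 0$. Introduce the neighboring paths by adding the auxiliary functions

$$y(\epsilon, x) = y(x) + \epsilon\eta_1(x) \tag{D}$$

$$z(\epsilon, x) = z(x) + \epsilon\eta_2(x) \tag{E}$$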
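Inserting the differentials of D and E into C gives

$$\frac{dg}{d\epsilon} = \left(\frac{\partial g}{\partial y}\eta_1(x) + \frac{\partial g}{\partial z}\eta_2(x)\right) = 0 \tag{F}$$

implying that

$$\eta_2(x) = -\frac{\partial g/\partial y}{\partial g/\partial z}\,\eta_1(x)$$

Equation A can be rewritten as

$$\int_{x_1}^{x_2}\left[\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right)\eta_1(x) + \left(\frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'}\right)\eta_2(x)\right]dx = \int_{x_1}^{x_2}\left[\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right) - \left(\frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'}\right)\frac{\partial g/\partial y}{\partial g/\partial z}\right]\eta_1(x)\,dx = 0 \tag{G}$$

Equation G now contains only a single arbitrary function $\eta_1(x)$ that is not restricted by the constraint. Thus the bracket in the integrand of equation G must equal zero for the extremum. That is

$$\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right)\left(\frac{\partial g}{\partial y}\right)^{-1} = \left(\frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'}\right)\left(\frac{\partial g}{\partial z}\right)^{-1} = -\lambda(x)$$

Now the left-hand side of this equation is only a function of $f$ and $g$ with respect to $y$ and $y'$, while the right-hand side is a function of $f$ and $g$ with respect to $z$ and $z'$. Because both sides are functions of $x$, each side can be set equal to a function $-\lambda(x)$. Thus the above equations can be written as

$$\frac{d}{dx}\frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y} = \lambda(x)\frac{\partial g}{\partial y} \qquad \frac{d}{dx}\frac{\partial f}{\partial z'} - \frac{\partial f}{\partial z} = \lambda(x)\frac{\partial g}{\partial z} \tag{H}$$

There are three unknown functions: $y(x)$, $z(x)$, and $\lambda(x)$. The complete solution for these three unknown functions is obtained by solving the two equations H plus the equation of constraint B. The Lagrange multiplier $\lambda(x)$ is related to the force of constraint. This example of two variables coupled by one holonomic constraint conforms with the general relation for many variables and constraints given by equation 5.47.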


5.9.2  Integral  equations  of  constraint 

The constraint equation also can be given in an integral form which is used frequently for isoperimetric problems. Consider a one dependent-variable isoperimetric problem, where it is required to find the curve $q = q(x)$ such that the functional has an extremum, and the curve $q(x)$ satisfies boundary conditions such that $q(x_1) = a$ and $q(x_2) = b$. That is

$$F(q) = \int_{x_1}^{x_2} f(q, q'; x)\,dx \tag{5.48}$$

is an extremum such that the perimeter also is a constraint that satisfies

$$G(q) = \int_{x_1}^{x_2} g(q, q'; x)\,dx = l \tag{5.49}$$

where $l$ is a fixed length. This is an integral constraint.

Analogous to equation 5.44 these two functionals can be combined requiring that

$$\delta K(q, x, \lambda) = \delta\left[F(q) + \lambda G(q)\right] = \delta\int_{x_1}^{x_2}\left[f + \lambda g\right]dx = 0 \tag{5.50}$$

That is, it is an extremum for both $q(x)$ and the Lagrange multiplier $\lambda$. This effectively involves finding the extremum path for the function $K(q, x, \lambda) = F(q, x) + \lambda G(q, x)$ where both $q(x)$ and $\lambda$ are the minimized variables. Therefore the curve $q(x)$ must satisfy the differential equation

$$\frac{d}{dx}\frac{\partial f}{\partial q_i'} - \frac{\partial f}{\partial q_i} + \lambda\left[\frac{d}{dx}\frac{\partial g}{\partial q_i'} - \frac{\partial g}{\partial q_i}\right] = 0 \tag{5.51}$$

subject to the boundary conditions $q(x_1) = a$, $q(x_2) = b$, and $G(q) = l$.

5.8  Example:  Catenary 

One isoperimetric problem is the catenary, which is the shape of a uniform rope or chain of fixed length $l$ that minimizes the gravitational potential energy. Let the rope have a uniform mass per unit length of $\sigma$ kg/m.

The gravitational potential energy is

$$U = \sigma g\int_{1}^{2} y\,ds = \sigma g\int_{1}^{2} y\sqrt{dx^2 + dy^2} = \sigma g\int_{1}^{2} y\sqrt{1 + y'^2}\,dx$$

The constraint is that the length be a constant $l$

$$l = \int_{1}^{2} ds = \int_{1}^{2}\sqrt{1 + y'^2}\,dx$$

Thus the function is $f(y, y'; x) = y\sqrt{1 + y'^2}$ while the integral constraint sets $g = \sqrt{1 + y'^2}$.

These need to be inserted into the Euler equation (5.51) by defining

$$F = f + \lambda g = (y + \lambda)\sqrt{1 + y'^2}$$

Note that this case is one where $\frac{\partial F}{\partial x} = 0$ and $\lambda$ is a constant; also, defining $z = y + \lambda$ then $z' = y'$. Therefore the Euler equation can be written in the integral form

$$F - z'\frac{\partial F}{\partial z'} = c = \text{constant}$$

Inserting the relation $F = z\sqrt{1 + z'^2}$ gives

$$z\sqrt{1 + z'^2} - z'\frac{z z'}{\sqrt{1 + z'^2}} = c$$

where $c$ is an arbitrary constant. This simplifies to

$$z'^2 = \left(\frac{z}{c}\right)^2 - 1$$

The integral of this is

$$z = c\cosh\left(\frac{x + b}{c}\right)$$

where $b$ and $c$ are arbitrary constants fixed by the locations of the two fixed ends of the rope.
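
For a concrete case the constants can be found numerically. The sketch below is an illustration under assumptions that are not in the text: the two supports are taken at equal height at $x = \pm d$, so that symmetry gives $b = 0$, and the fixed rope length then determines $c$ through $l = 2c\sinh(d/c)$, which follows from integrating $\sqrt{1 + z'^2}$ for $z = c\cosh(x/c)$. The numerical values and function names are hypothetical.

```python
import math

def rope_length(c, d):
    """Arc length of z = c*cosh(x/c) between x = -d and x = +d."""
    return 2.0 * c * math.sinh(d / c)

def solve_c(length, d, iterations=200):
    """Bisection for c; assumes length > 2*d so that a hanging solution exists."""
    lo, hi = d / 50.0, 1000.0 * d           # rope_length decreases as c grows
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if rope_length(mid, d) > length:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

d, length = 1.0, 3.0                         # illustrative half-span and rope length
c = solve_c(length, d)

# Independent check: integrate sqrt(1 + z'^2) numerically with the midpoint rule.
n = 20000
dx = 2.0 * d / n
numeric_length = sum(
    math.sqrt(1.0 + math.sinh((-d + (k + 0.5) * dx) / c) ** 2) * dx
    for k in range(n))
sag = c * math.cosh(d / c) - c               # drop of the lowest point below the supports
print(c, numeric_length, sag)
```

The multiplier $\lambda$ only shifts the curve vertically through $z = y + \lambda$, so once $c$ is known it follows from the height of the supports.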

5.9  Example:  The  Queen  Dido  problem 

A famous constrained isoperimetric legend is that of Dido, first Queen of Carthage. Legend says that, when Dido landed in North Africa, she persuaded the local chief to sell her as much land as an oxhide could contain. She cut an oxhide into narrow strips and joined them to make a continuous thread more than four kilometers in length, which was sufficient to enclose the land adjoining the coast on which Carthage was built. Her problem was to enclose the maximum area for a given perimeter. Let us assume that the coast line is straight and the ends of the thread are at $\pm a$ on the coast line. The enclosed area is given by

$$A = \int_{-a}^{+a} y\,dx$$

The constraint equation is that the total perimeter equals $l$.

$$\int_{-a}^{+a}\sqrt{1 + y'^2}\,dx = l$$
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Thus  we  have  that  the  functional  f(y,y',x)  = y and  g(y,y',x)  = y 1 + y'2 . Then  |^  = 1,  ^-=0,  ||=0 
and  7 = jV  /0 • Insert  these  into  the  Euler- Lagrange  equation  (5.51)  gives 


That  is 


Integrate  with  respect  to  x gives 


dx  + y'  2 


± y'  =1 

dx  y7 1 + y'2  A 

V 

— , = x — 0 


V1  + v'2 

where $b$ is a constant of integration. This can be rearranged to give
$$y' = \frac{\pm(x - b)}{\sqrt{\lambda^2 - (x - b)^2}}$$
The integral of this is
$$y = \mp\sqrt{\lambda^2 - (x - b)^2} + c$$
Rearranging this gives
$$(x - b)^2 + (y - c)^2 = \lambda^2$$

This is the equation of a circle centered at $(b, c)$. Setting the end points to be $(-a, 0)$ and $(a, 0)$ gives $b = 0$ by symmetry, and taking the thread to form a semicircle, $a = \lambda$, gives $c = 0$ with circle radius $\lambda$. Thus the length of the thread must be $l = \pi\lambda$. Assuming that $l = 4$ km then $\lambda = 1.27$ km and Queen Dido could buy an area of $\pi\lambda^2/2 = 2.55$ km$^2$.
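The numbers in the Dido example are easy to reproduce. The sketch below assumes the semicircular solution with $l = \pi\lambda$; the comparison with a rectangle built against the coast is an added illustration (the function names are hypothetical) showing that the semicircle encloses more land than the best rectangle made from the same thread.

```python
import math

def dido_semicircle(thread_length):
    """Radius and enclosed area for a semicircular thread against a straight coast."""
    radius = thread_length / math.pi   # from l = pi * lambda
    area = 0.5 * math.pi * radius**2   # half of a full disc
    return radius, area

def rectangle_area(thread_length, width):
    """Area of a rectangle whose three free sides use the whole thread."""
    depth = (thread_length - width) / 2.0
    return width * depth

radius, area = dido_semicircle(4.0)        # thread length in km
best_rectangle = rectangle_area(4.0, 2.0)  # width l/2 maximises the rectangle
print(round(radius, 3), round(area, 3), best_rectangle)
```

For a 4 km thread the semicircle gives about 2.55 km$^2$, compared with 2 km$^2$ for the optimal rectangle.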

5.10  Geodesic 


The  geodesic  is  defined  as  the  shortest  path  between  two  fixed  points  for  motion  that  is  constrained  to  lie 
on  a surface.  Variational  calculus  provides  a powerful  approach  for  determining  the  equations  of  motion 
constrained  to  follow  a geodesic. 

The  use  of  variational  calculus  is  illustrated  by  considering  the  geodesic  constrained  to  follow  the  surface 
of  a sphere  of  radius  R.  As  discussed  in  appendix  C.2.3,  the  element  of  path  length  on  the  surface  of  the 

sphere is given in spherical coordinates as $ds = R\sqrt{d\theta^2 + (\sin\theta\,d\phi)^2}$. Therefore the distance $s$ between two points 1 and 2 is
$$s = R\int_1^2 \sqrt{\left(\frac{d\theta}{d\phi}\right)^2 + \sin^2\theta}\;d\phi$$


The function $f$ that makes $s$ an extremum is
$$f = \sqrt{\theta'^2 + \sin^2\theta} \tag{5.53}$$
where $\theta' = d\theta/d\phi$. This is a case where $\partial f/\partial\phi = 0$ and thus the integral form of Euler’s equation can be used, leading to the result that
$$\sqrt{\theta'^2 + \sin^2\theta} - \theta'\frac{\partial}{\partial\theta'}\sqrt{\theta'^2 + \sin^2\theta} = \text{constant} = a \tag{5.54}$$
This gives that
$$\sin^2\theta = a\sqrt{\theta'^2 + \sin^2\theta}$$
This can be rewritten as
$$\frac{d\phi}{d\theta} = \frac{1}{\theta'} = \frac{a\csc^2\theta}{\sqrt{1 - a^2\csc^2\theta}}$$
Solving for $\phi$ gives
$$\phi = \sin^{-1}\left(\frac{\cot\theta}{\beta}\right) + \alpha \tag{5.57}$$
where
$$\beta \equiv \frac{\sqrt{1 - a^2}}{a} \tag{5.58}$$
That is
$$\cot\theta = \beta\sin(\phi - \alpha) \tag{5.59}$$
Expanding the sine and cotangent gives
$$(\beta\cos\alpha)\,R\sin\theta\sin\phi - (\beta\sin\alpha)\,R\sin\theta\cos\phi = R\cos\theta \tag{5.60}$$
Since the brackets are constants, this can be written as
$$A\,(R\sin\theta\sin\phi) - B\,(R\sin\theta\cos\phi) = (R\cos\theta) \tag{5.61}$$
The terms in the brackets are just expressions for the rectangular coordinates $y$, $x$, and $z$ respectively. That is,
$$Ay - Bx = z \tag{5.62}$$


This is the equation of a plane passing through the center of the sphere. Thus the geodesic on a sphere is the curve along which the plane through the center and the initial and final points intersects the sphere. This geodesic is called a great circle. Euler’s equation gives both extremum path lengths for motion on this great circle, namely the shorter and the longer of the two arcs joining the points.
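The step from equation (5.59) to the plane equation (5.62) can be verified numerically. This is a minimal sketch, assuming a unit sphere and arbitrary illustrative values of $\alpha$ and $\beta$; the helper name `great_circle_point` is hypothetical. It generates points satisfying $\cot\theta = \beta\sin(\phi - \alpha)$ and checks that each lies in the plane $Ay - Bx = z$ with $A = \beta\cos\alpha$ and $B = \beta\sin\alpha$.

```python
import math

def great_circle_point(phi, alpha, beta, R=1.0):
    """Cartesian point on the curve cot(theta) = beta*sin(phi - alpha)."""
    cot_theta = beta * math.sin(phi - alpha)
    theta = math.atan2(1.0, cot_theta)  # polar angle in (0, pi)
    x = R * math.sin(theta) * math.cos(phi)
    y = R * math.sin(theta) * math.sin(phi)
    z = R * math.cos(theta)
    return x, y, z

alpha, beta = 0.4, 1.3  # illustrative constants of integration
A, B = beta * math.cos(alpha), beta * math.sin(alpha)

residuals = []
for k in range(12):
    phi = 2.0 * math.pi * k / 12
    x, y, z = great_circle_point(phi, alpha, beta)
    residuals.append(A * y - B * x - z)  # vanishes if the point lies in the plane

print(max(abs(r) for r in residuals))
```

Because the plane contains the origin, its intersection with the sphere is a great circle, as stated above.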

Chapter  16  discusses  the  geodesic  in  the  four-dimensional  space-time  coordinates  that  underlie  the  General 
Theory  of  Relativity.  As  a consequence,  the  use  of  the  calculus  of  variations  to  determine  the  equations  of 
motion  for  geodesics  plays  a pivotal  role  in  the  General  Theory  of  Relativity. 


5.11  Variational  approach  to  classical  mechanics 

This chapter has introduced the general principles of variational calculus needed for understanding the Lagrangian and Hamiltonian approaches to classical mechanics. Although variational calculus was developed originally for classical mechanics, it has now grown to be an important branch of mathematics with applications to many other fields outside of physics. The prologue of this book emphasized the dramatic differences between the differential vectorial approach of Newtonian mechanics, and the integral variational approaches of Lagrangian and Hamiltonian mechanics. The Newtonian vectorial approach involves solving Newton’s differential equations of motion that relate the force and momentum vectors. This requires knowledge of the time dependence of all the force vectors, including constraint forces, acting on the system, which can be very complicated. Chapter 2 showed that the first-order time integrals, equations 2.10, 2.16, relate the initial and final total momenta without requiring knowledge of the complicated instantaneous forces acting during the collision of two bodies. Similarly, for conservative systems, the first-order spatial integral, equation 2.21, relates the initial and final total energies to the net work done on the system without requiring knowledge of the instantaneous force vectors. The first-order spatial integral has the advantage that it is a scalar quantity, in contrast to time integrals which are vector quantities. These first-order integral relations are used frequently in Newtonian mechanics to derive solutions of the equations of motion that avoid having to solve complicated differential equations of motion.

This chapter has illustrated that variational principles provide a means of deriving more detailed information, such as the trajectories for the motion between given initial and final conditions, by requiring that scalar functionals have extremum values. For example, the solution of the brachistochrone problem determined the trajectory having the minimum transit time, based on only the magnitudes of the kinetic and gravitational potential energies. Similarly, the catenary shape of a suspended chain was derived by minimizing the gravitational potential energy. The calculus of variations uses Euler’s equations to determine directly the differential equations of motion of the system that lead to the functional of interest being stationary at an extremum. The Lagrangian and Hamiltonian variational approaches to classical mechanics are discussed in chapters 6 to 16. The broad range of applicability, the flexibility, and the power provided by variational approaches to classical mechanics and modern physics will be illustrated.

5.12  Summary


Euler’s differential equation: The calculus of variations has been introduced and Euler’s differential equation was derived. The calculus of variations reduces to varying the functions $y_i(x)$, where $i = 1, 2, 3, \ldots, n$, such that the integral
$$F = \int_{x_1}^{x_2} f[y_i(x), y_i'(x); x]\,dx \tag{5.16}$$
is an extremum, that is, it is a maximum or minimum. Here $x$ is the independent variable, and $y_i(x)$ are the dependent variables with first derivatives $y_i' = dy_i/dx$. The quantity $f[y_i(x), y_i'(x); x]$ has some given dependence on $y_i$, $y_i'$ and $x$. The calculus of variations involves varying the functions $y_i(x)$ until a stationary value of $F$ is found, which is presumed to be an extremum. It was shown that if the $y_i(x)$ are independent, then the extremum value of $F$ leads to $n$ independent Euler equations
$$\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_i'} = 0 \tag{5.19}$$
where $i = 1, 2, 3, \ldots, n$. This can be used to determine the functional form $y_i(x)$ that ensures that the integral $F = \int f[y_i(x), y_i'(x); x]\,dx$ is a stationary value, that is, presumably a maximum or minimum value.

Note that Euler’s equation involves partial derivatives for the dependent variables $y_i$, $y_i'$, and the total derivative for the independent variable $x$.
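As a brief illustration of how equation (5.19) is applied, consider the shortest path between two points in a plane. The example and the constants $c$, $k$, and $y_0$ are introduced here for illustration only and are not part of the summary above.

```latex
% Path length in the plane, with a single dependent variable y(x)
f(y, y'; x) = \sqrt{1 + y'^2}
% f has no explicit dependence on y, so Euler's equation (5.19) reduces to
\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}
  = -\frac{d}{dx}\left(\frac{y'}{\sqrt{1 + y'^2}}\right) = 0
% The bracket is therefore a constant c, which forces the slope y' to be constant
\frac{y'}{\sqrt{1 + y'^2}} = c
  \quad\Longrightarrow\quad y' = \frac{c}{\sqrt{1 - c^2}} \equiv k
  \quad\Longrightarrow\quad y = kx + y_0
```

The stationary path is a straight line, with $k$ and $y_0$ fixed by the two end points.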

Euler’s integral equation: It was shown that Euler’s differential equation can be written in an equivalent integral form. This integral form is especially useful when $\partial f/\partial x = 0$, that is, when $f$ does not depend explicitly on the independent variable $x$, since then the first integral of the Euler equation is a constant
$$f - y'\frac{\partial f}{\partial y'} = \text{constant} \tag{5.25}$$


Constrained  variational  systems:  Most  applications  involve  constraints  on  the  motion.  The  equations 
of  constraint  can  be  classified  according  to  whether  the  constraints  are  holonomic  or  non-holonomic,  the  time 
dependence  of  the  constraints,  and  whether  the  constraint  forces  are  conservative. 

Generalized  coordinates  in  variational  calculus:  Independent  generalized  coordinates  can  be  chosen 
that  are  perpendicular  to  the  rigid  constraint  forces  and  therefore  the  constraint  does  not  contribute  to  the 
functional  being  minimized.  That  is,  the  constraints  are  embedded  into  the  generalized  coordinates  and  thus 
the  constraints  can  be  ignored  when  deriving  the  variational  solution. 

Minimal set of generalized coordinates: If the constraints are holonomic then the $m$ holonomic equations of constraint can be used to transform the $n$ coupled generalized coordinates to $s = n - m$ independent generalized variables $q_i$, $q_i'$. The generalized coordinate method then uses Euler’s equations to determine these $s = n - m$ independent generalized coordinates.
$$\frac{\partial f}{\partial q_i} - \frac{d}{dx}\frac{\partial f}{\partial q_i'} = 0 \tag{5.35}$$


Lagrange multipliers for holonomic constraints: The Lagrange multipliers approach for $n$ variables, plus $m$ holonomic equations of constraint, determines all $n + m$ unknowns for the system. The holonomic forces of constraint acting on the $n$ variables are related to the Lagrange multiplier terms $\lambda_k(x)\,\partial g_k/\partial y_i$ that are introduced into the Euler equations. That is,
$$\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_i'} + \sum_{k}^{m}\lambda_k(x)\frac{\partial g_k}{\partial y_i} = 0 \tag{5.48}$$
where the holonomic equations of constraint are given by
$$g_k(y_i; x) = 0 \tag{5.38}$$


The advantage of using the Lagrange multiplier approach is that the variational procedure simultaneously determines both the equations of motion for the $n$ variables plus the $m$ constraint forces acting on the system.

Workshop  exercises

1.  Find the extremal of the functional
$$J(x) = \int_1^2 \frac{\dot{x}^2}{t^3}\,dt$$
that satisfies $x(1) = 3$ and $x(2) = 18$. Show that this extremal provides the global minimum of $J$.

2.  Consider  the  use  of  equations  of  constraint. 

(a)  A particle  is  constrained  to  move  on  the  surface  of  a sphere.  What  are  the  equations  of  constraint  for  this 
system? 

(b)  A disk of mass $m$ and radius $R$ rolls without slipping on the outside surface of a half-cylinder of radius $5R$. What are the equations of constraint for this system?

(c)  What  are  holonomic  constraints?  Which  of  the  equations  of  constraint  that  you  found  above  are  holo- 
nomic? 

(d)  Equations  of  constraint  that  do  not  explicitly  contain  time  are  said  to  be  scleronomic.  Moving  constraints 
are  rheonomic.  Are  the  equations  of  constraint  that  you  found  above  scleronomic  or  rheonomic? 

3.  For  each  of  the  following  systems,  describe  the  generalized  coordinates  that  would  work  best.  There  may  be 
more  than  one  answer  for  each  system. 

(a)  An  inclined  plane  of  mass  M is  sliding  on  a smooth  horizontal  surface,  while  a particle  of  mass  m is 
sliding  on  the  smooth  inclined  surface. 

(b)  A disk  rolls  without  slipping  across  a horizontal  plane.  The  plane  of  the  disk  remains  vertical,  but  it  is 
free  to  rotate  about  a vertical  axis. 

(c)  A double  pendulum  consisting  of  two  simple  pendula,  with  one  pendulum  suspended  from  the  bob  of  the 
other.  The  two  pendula  have  equal  lengths  and  have  bobs  of  equal  mass.  Both  pendula  are  confined  to 
move  in  the  same  plane. 

(d)  A particle of mass $m$ is constrained to move on a circle of radius $R$. The circle rotates in space about one point on the circle, which is fixed. The rotation takes place in the plane of the circle, with constant angular speed $\omega$, in the absence of a gravitational force.

(e)  A particle of mass $m$ is attracted toward a given point by a force of magnitude $k/r^2$, where $k$ is a constant.

4.  Looking back at the systems in problem 3, which ones could have equations of constraint? How would you classify the equations of constraint (holonomic, scleronomic, rheonomic, etc.)?

Problems

1.  Find the extremal of the functional
$$J(x) = \int_0^{\pi} \left(2x\sin t - \dot{x}^2\right)dt$$
that satisfies $x(0) = x(\pi) = 0$. Show that this extremal provides the global maximum of $J$.


2.  Find and describe the path y = y(x) for which the integral j^-iy'Ur is stationary.


3.  Find  the  dimensions  of  the  parallelepiped  of  maximum  volume  circumscribed  by  a sphere  of  radius  R. 

4.  Consider a single loop of the cycloid having a fixed value of $a$ as shown in the figure. A cart is released from rest at a point $P_0$ anywhere on the track between $O$ and the lowest point $P$, that is, $P_0$ has a parameter $0 < \theta_0 < \pi$.

(a)  Show that the time $T$ for the cart to slide from $P_0$ to $P$ is given by the integral
$$T(P_0 \rightarrow P) = \sqrt{\frac{a}{g}}\int_{\theta_0}^{\pi}\sqrt{\frac{1 - \cos\theta}{\cos\theta_0 - \cos\theta}}\;d\theta$$

(b)  Prove that this time $T$ is equal to $\pi\sqrt{a/g}$, which is independent of the position $P_0$.

(c)  Explain  qualitatively  how  this  surprising  result  can  possibly  be  true. 


5.  Consider a medium for which the refractive index $n = a/r^2$ where $a$ is a constant and $r$ is the distance from the origin. Use Fermat’s Principle to find the path of a ray of light travelling in a plane containing the origin. Hint: use two-dimensional polar coordinates with $\phi = \phi(r)$. Show that the resulting path is a circle through the origin.

6.  Find the shortest path between the $(x, y, z)$ points $(0, -1, 0)$ and $(0, 1, 0)$ on the conical surface
$$z = 1 - \sqrt{x^2 + y^2}$$
What is the length of this path? Note that this is the shortest mountain path around a volcano.

7.  Show  that  the  geodesic  on  the  surface  of  a right  circular  cylinder  is  a segment  of  a helix. 


Chapter  6 

Lagrangian  dynamics 


6.1  Introduction 


Newtonian  mechanics  is  based  on  vector  observables  such  as  momentum  and  force,  and  Newton’s  equations 
of  motion  can  be  derived  if  the  forces  are  known.  However,  Newtonian  mechanics  becomes  difficult  for 
many-body  systems  when  constraint  forces  apply.  The  alternative  algebraic  Lagrangian  mechanics  approach 
is  based  on  the  concept  of  scalar  energies  which  circumvent  many  of  the  difficulties  in  handling  constraint 
forces  and  many-body  systems. 

The Lagrangian approach to classical dynamics is based on the calculus of variations introduced in chapter 5. It was shown that the calculus of variations determines the functions $y_i(x)$ such that the scalar functional
$$F = \int_{x_1}^{x_2}\sum_i^n f[y_i(x), y_i'(x); x]\,dx \tag{6.1}$$
is an extremum, that is, a maximum or minimum. Here $x$ is the independent variable, $y_i(x)$ are the $n$ dependent variables, and $y_i' = dy_i/dx$ are their derivatives, where $i = 1, 2, 3, \ldots, n$. The function $f[y_i(x), y_i'(x); x]$ has an assumed dependence on $y_i$, $y_i'$ and $x$. The calculus of variations determines the functional dependence of the dependent variables $y_i(x)$ on the independent variable $x$ that is required to ensure that $F$ is an extremum. For $n$ independent variables, $F$ has a stationary point, which is presumed to be an extremum, that is determined by solution of Euler’s differential equations
$$\frac{d}{dx}\frac{\partial f}{\partial y_i'} - \frac{\partial f}{\partial y_i} = 0 \tag{6.2}$$


If the coordinates $y_i(x)$ are independent, then the Euler equations, (6.2), for each coordinate $i$ are independent. However, for constrained motion, the constraints lead to auxiliary conditions that correlate the coordinates. As shown in chapter 5, a transformation to independent generalized coordinates can be made such that the correlations induced by the constraint forces are embedded into the choice of the independent generalized coordinates. The use of generalized coordinates in Lagrangian mechanics simplifies derivation of the equations of motion for constrained systems. For example, for a system of $n$ coordinates that involves $m$ holonomic constraints, there are $s = n - m$ independent generalized coordinates. For such holonomic constrained motion, it will be shown that the Euler equations can be solved using any of the following three alternative ways.

1)  The minimal set of generalized coordinates approach involves finding a set of $s = n - m$ independent generalized coordinates $q_i$ that satisfy the assumptions underlying (6.3). These generalized coordinates can be determined if the $m$ equations of constraint are holonomic, that is, related by algebraic equations of constraint
$$g_k(q_i; x) = 0 \tag{6.3}$$
where $k = 1, 2, 3, \ldots, m$. These equations uniquely determine the relationship between the $n$ correlated coordinates. This method has the advantage that it reduces the system of $n$ coordinates, subject to $m$ constraints, to $s = n - m$ independent generalized coordinates, which reduces the dimension of the problem to be solved. However, it does not explicitly determine the forces of constraint, which are effectively swept under the rug.

2)  The Lagrange multipliers approach takes account of the correlation between the $n$ coordinates and $m$ holonomic constraints by introducing the Lagrange multipliers $\lambda_k(x)$. These $n$ generalized coordinates $q_i$ are correlated by the $m$ holonomic constraints.
$$\frac{d}{dx}\frac{\partial f}{\partial q_i'} - \frac{\partial f}{\partial q_i} = \sum_k^m \lambda_k(x)\frac{\partial g_k}{\partial q_i} \tag{6.4}$$
where $i = 1, 2, 3, \ldots, n$. The Lagrange multiplier approach has the advantage that Euler’s calculus of variations automatically uses the $n$ Lagrange equations, plus the $m$ equations of constraint, to explicitly determine both the $n$ coordinates $q_i$ and the $m$ forces of constraint, which are related to the Lagrange multipliers $\lambda_k$ as given in equation (6.4). Chapter 6.2 shows that the $\sum_k^m \lambda_k(x)\,\partial g_k/\partial q_i$ terms are directly related to the holonomic forces of constraint.

3)  The  generalized  force  approach  incorporates  the  forces  of  constraint  explicitly  as  will  be  shown 
in  chapter  6.5.3.  Generalized  forces  include  the  constraint  forces  explicitly,  and  thus  can  accommodate 
holonomic,  non-holonomic,  and  non-conservative  forces. 
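The difference between the first two approaches can be sketched for a plane pendulum. This example is an assumption made for illustration: polar coordinates $(r, \theta)$ with $\theta$ measured from the downward vertical, $L = \frac{1}{2}m(\dot{r}^2 + r^2\dot{\theta}^2) + mgr\cos\theta$, and the single holonomic constraint $g_1 = r - l = 0$, so that $n = 2$, $m = 1$, and $s = 1$. The minimal-set approach integrates only the $\theta$ equation, while the multiplier $\lambda = -m(l\dot{\theta}^2 + g\cos\theta)$ from the radial equation recovers the string tension $-\lambda$. The function names are hypothetical.

```python
import math

def rk4_step(theta, omega, dt, g, length):
    """Advance theta'' = -(g/length)*sin(theta) by one classical RK4 step."""
    def rhs(th, om):
        return om, -(g / length) * math.sin(th)
    k1 = rhs(theta, omega)
    k2 = rhs(theta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1])
    k3 = rhs(theta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1])
    k4 = rhs(theta + dt * k3[0], omega + dt * k3[1])
    theta += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    omega += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return theta, omega

def multiplier(theta, omega, mass, g, length):
    """Lagrange multiplier from the radial equation with r held at length."""
    return -mass * (length * omega**2 + g * math.cos(theta))

def peak_tension(theta0, mass=1.0, g=9.81, length=1.0, dt=1e-3, t_end=5.0):
    """Largest string tension (-lambda) for release from rest at theta0."""
    theta, omega = theta0, 0.0
    peak = -multiplier(theta, omega, mass, g, length)
    for _ in range(int(round(t_end / dt))):
        theta, omega = rk4_step(theta, omega, dt, g, length)
        peak = max(peak, -multiplier(theta, omega, mass, g, length))
    return peak

theta0 = math.radians(60.0)
peak = peak_tension(theta0)
# Energy conservation predicts the tension at the bottom: m*g*(3 - 2*cos(theta0))
expected = 1.0 * 9.81 * (3.0 - 2.0 * math.cos(theta0))
print(peak, expected)
```

The minimal-set equation alone gives the motion; the constraint force appears only when the multiplier is evaluated.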

The physics underlying the Lagrange formulation of classical mechanics will be illustrated by use of a plausibility argument that is based on Newton’s laws of motion. This will be followed by a more rigorous derivation of the Lagrangian formulation developed by the following two approaches that better elucidate the physics underlying the Lagrangian and Hamiltonian analytic representations of classical mechanics. In 1788 Lagrange derived his equations of motion using the differential d’Alembert Principle, which extends to dynamical systems the Bernoulli Principle of infinitesimal virtual displacements and virtual work. The other approach, developed in 1834, uses the integral Hamilton’s Principle to derive the Lagrange equations. Euler’s variational calculus underlies d’Alembert’s Principle and Hamilton’s Principle since both are based on the philosophical belief that the laws of nature prefer economy of motion. Chapters 6.2 to 6.5 show that both d’Alembert’s Principle and Hamilton’s Principle lead to the Euler-Lagrange equations. This will be followed by examples to illustrate the use of Lagrangian mechanics in classical mechanics.


6.2  Newtonian  plausibility  argument  for  Lagrangian  mechanics 


Insight into the physics underlying Lagrangian mechanics is given by showing the direct relationship between Newtonian and Lagrangian mechanics. The variational approaches to classical mechanics exploit the first-order spatial integral of the force, equation 2.17, which equals the work done between the initial and final conditions. This is a simple scalar quantity that depends only on the initial and final locations for conservative forces. Newton’s equation of motion is
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{6.5}$$


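The kinetic energy is given by
$$T = \frac{mv^2}{2} = \frac{\mathbf{p}\cdot\mathbf{p}}{2m} = \frac{p_x^2}{2m} + \frac{p_y^2}{2m} + \frac{p_z^2}{2m}$$
It can be seen that
$$\frac{\partial T}{\partial\dot{x}} = p_x \tag{6.6}$$
and
$$\frac{d}{dt}\frac{\partial T}{\partial\dot{x}} = \frac{dp_x}{dt} \tag{6.7}$$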

Consider that the force, acting on a mass $m$, is arbitrarily separated into two components, one part that is conservative, and thus can be written as the gradient of a scalar potential $U$, plus the excluded part of the force, $\mathbf{F}^{EX}$. The excluded part of the force $\mathbf{F}^{EX}$ could include non-conservative frictional forces as well as forces of constraint which may be conservative or non-conservative. This separation allows the force to be written as
$$\mathbf{F} = -\nabla U + \mathbf{F}^{EX} \tag{6.8}$$
Along each of the $x_i$ axes,
$$\frac{d}{dt}\frac{\partial T}{\partial\dot{x}_i} = -\frac{\partial U}{\partial x_i} + F^{EX}_{x_i} \tag{6.9}$$

Equation (6.9) can be extended by transforming the cartesian coordinate $x_i$ to the generalized coordinates $q_i$.

Define the standard Lagrangian to be the difference between the kinetic energy and the potential energy, which can be written in terms of the generalized coordinates $q_i$ as
$$L(q_i, \dot{q}_i) = T(\dot{q}_i) - U(q_i) \tag{6.10}$$
Assume that the potential is only a function of the generalized coordinates $q_i$, that is $\partial U/\partial\dot{q}_i = 0$, then
$$\frac{\partial L}{\partial\dot{q}_i} = \frac{\partial T}{\partial\dot{q}_i} - \frac{\partial U}{\partial\dot{q}_i} = \frac{\partial T}{\partial\dot{q}_i} \tag{6.11}$$
Using the above equations allows Newton’s equation of motion (6.9) to be expressed as
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i} - \frac{\partial L}{\partial q_i} = F^{EX}_{q_i} \tag{6.12}$$


The excluded force $F^{EX}_{q_i}$ can be partitioned into a holonomic constraint force $F^{HC}_{q_i}$, plus any remaining excluded forces $F^{EXC}_{q_i}$, as given by
$$F^{EX}_{q_i} = F^{HC}_{q_i} + F^{EXC}_{q_i} \tag{6.13}$$
A comparison of equations (6.12) and (6.4) shows that the holonomic constraint forces $F^{HC}_{q_i}$, that are contained in the excluded force $F^{EX}_{q_i}$, can be identified with the Lagrange multiplier term in equation 6.4.
$$F^{HC}_{q_i} = \sum_{k=1}^{m}\lambda_k(t)\frac{\partial g_k}{\partial q_i} \tag{6.14}$$
That is, the Lagrange multiplier terms can be used to account for holonomic constraint forces $F^{HC}_{q_i}$. Thus equation 6.12 can be written as
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i} - \frac{\partial L}{\partial q_i} = \sum_{k=1}^{m}\lambda_k(t)\frac{\partial g_k}{\partial q_i} + F^{EXC}_{q_i} \tag{6.15}$$
where the Lagrange multiplier term accounts for holonomic constraint forces, and $F^{EXC}_{q_i}$ includes all the remaining forces that are not accounted for by the scalar potential $U$, or the Lagrange multiplier terms $F^{HC}_{q_i}$.

For holonomic, conservative forces it is possible to absorb all the forces into the potential $U$ plus the Lagrange multiplier term, that is $F^{EXC}_{q_i} = 0$. Moreover, the use of a minimal set of generalized coordinates allows the holonomic constraint forces to be ignored by explicitly reducing the number of coordinates from $n$ dependent coordinates to $s = n - m$ independent generalized coordinates. That is, the correlations due to the constraint forces are embedded into the generalized coordinates. Then equation 6.15 reduces to the basic Euler differential equations.
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i} - \frac{\partial L}{\partial q_i} = 0 \tag{6.16}$$
Note that equation 6.16 is identical to Euler’s equation 5.34, if the independent variable $x$ is replaced by time $t$. Thus Newton’s equations of motion are equivalent to minimizing, or more generally making stationary, the action integral $S = \int_{t_1}^{t_2} L\,dt$, that is
$$\delta S = \delta\int_{t_1}^{t_2} L(q_i, \dot{q}_i, t)\,dt = 0 \tag{6.17}$$
which is Hamilton’s Principle. Hamilton’s Principle underlies many aspects of physics and is now used as the starting point for developing classical mechanics. Hamilton’s Principle was postulated 46 years after Lagrange introduced Lagrangian mechanics.
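Equation (6.17) can be explored numerically. The sketch below is an illustration under stated assumptions: a particle in uniform gravity with $L = \frac{1}{2}m\dot{y}^2 - mgy$, a midpoint-rule discretisation of the action, and trial paths obtained by adding $\epsilon\sin(\pi t/t_{end})$ to the Newtonian trajectory so that the end points stay fixed. The function names and parameter values are hypothetical.

```python
import math

def action(path, t_end, mass=1.0, g=9.81, steps=2000):
    """Discretised action S = integral of (T - U) dt using the midpoint rule."""
    dt = t_end / steps
    total = 0.0
    for k in range(steps):
        y0, y1 = path(k * dt), path((k + 1) * dt)
        velocity = (y1 - y0) / dt
        y_mid = 0.5 * (y0 + y1)
        total += (0.5 * mass * velocity**2 - mass * g * y_mid) * dt
    return total

def newton_path(t, v0=5.0, g=9.81):
    """Solution of Newton's equation for a projectile launched upward from y = 0."""
    return v0 * t - 0.5 * g * t**2

def perturbed_path(t, eps, t_end=1.0):
    """Trial path with the same end points as the Newtonian path."""
    return newton_path(t) + eps * math.sin(math.pi * t / t_end)

s_true = action(newton_path, 1.0)
s_varied = [action(lambda t, e=eps: perturbed_path(t, e), 1.0)
            for eps in (-0.2, -0.05, 0.05, 0.2)]
print(s_true, s_varied)
```

For this Lagrangian every perturbed path has a larger action than the Newtonian path, and the excess grows as $\epsilon^2$, which is the signature of a stationary point.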

The above plausibility argument, which is based on Newtonian mechanics, illustrates the close connection between the vectorial Newtonian mechanics and the algebraic Lagrangian mechanics approaches to classical mechanics.

6.3  Lagrange  equations  from  d’Alembert’s  Principle

6.3.1  d’Alembert’s  Principle  of  virtual  work 

The Principle of Virtual Work provides a basis for a rigorous derivation of Lagrangian mechanics. Bernoulli introduced the concept of virtual infinitesimal displacement of a system mentioned in chapter 5.9.1. This refers to a change in the configuration of the system as a result of any arbitrary infinitesimal instantaneous change of the coordinates $\delta\mathbf{r}_i$, that is consistent with the forces and constraints imposed on the system at the instant $t$. Lagrange’s symbol $\delta$ is used to designate a virtual displacement which is called "virtual" to imply that there is no change in time $t$, i.e. $\delta t = 0$. This distinguishes it from an actual displacement $d\mathbf{r}_i$ of body $i$ during a time interval $dt$ when the forces and constraints may change.

Suppose that the system of $N$ particles is in equilibrium, that is, the total force on each particle $i$ is zero. The virtual work done by the force $\mathbf{F}_i$ moving a distance $\delta\mathbf{r}_i$ is given by the dot product $\mathbf{F}_i\cdot\delta\mathbf{r}_i$. For equilibrium, the sum of all these products for the $N$ bodies also must be zero
$$\sum_i^N \mathbf{F}_i\cdot\delta\mathbf{r}_i = 0 \tag{6.18}$$
Decomposing the force $\mathbf{F}_i$ on particle $i$ into applied forces $\mathbf{F}_i^A$ and constraint forces $\mathbf{f}_i^C$ gives
$$\sum_i^N \mathbf{F}_i^A\cdot\delta\mathbf{r}_i + \sum_i^N \mathbf{f}_i^C\cdot\delta\mathbf{r}_i = 0 \tag{6.19}$$
The second term in equation 6.19 can be ignored if the virtual work due to the constraint forces is zero. This is rigorously true for rigid bodies and is valid for any forces of constraint where the constraint forces are perpendicular to the constraint surface and the virtual displacement is tangent to this surface. Thus if the constraint forces do no work, then (6.19) reduces to
$$\sum_i^N \mathbf{F}_i^A\cdot\delta\mathbf{r}_i = 0 \tag{6.20}$$


This relation is Bernoulli’s Principle of Static Virtual Work and is used to solve problems in statics. Bernoulli introduced dynamics by using Newton’s Law to relate force and momentum.
$$\mathbf{F}_i = \dot{\mathbf{p}}_i \tag{6.21}$$
Equation (6.21) can be rewritten as
$$\mathbf{F}_i - \dot{\mathbf{p}}_i = 0 \tag{6.22}$$
In 1742, d’Alembert developed the Principle of Dynamic Virtual Work in the form
$$\sum_i^N (\mathbf{F}_i - \dot{\mathbf{p}}_i)\cdot\delta\mathbf{r}_i = 0 \tag{6.23}$$
Using equations (6.19) plus (6.23) gives
$$\sum_i^N (\mathbf{F}_i^A - \dot{\mathbf{p}}_i)\cdot\delta\mathbf{r}_i + \sum_i^N \mathbf{f}_i^C\cdot\delta\mathbf{r}_i = 0 \tag{6.24}$$
For the special case where the virtual work done by the forces of constraint is zero, equation 6.24 reduces to d’Alembert’s Principle
$$\sum_i^N (\mathbf{F}_i^A - \dot{\mathbf{p}}_i)\cdot\delta\mathbf{r}_i = 0 \tag{6.25}$$
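A short worked sketch of equation (6.25) follows, assuming an ideal Atwood machine (massless inextensible string, frictionless pulley). This system is not part of the original discussion; the masses $m_1$, $m_2$ and the downward coordinate $x$ of $m_1$ are labels introduced here for illustration.

```latex
% Inextensible string: if m_1 moves down by \delta x, then m_2 moves up by \delta x
\delta x_1 = \delta x, \qquad \delta x_2 = -\delta x, \qquad
\ddot{x}_1 = \ddot{x}, \qquad \ddot{x}_2 = -\ddot{x}
% Applied forces are the weights (x_1 and x_2 both measured downward);
% the string tension does no net virtual work and drops out
\sum_i \left(F_i^A - \dot{p}_i\right)\delta x_i
  = \left(m_1 g - m_1\ddot{x}\right)\delta x
  + \left(m_2 g + m_2\ddot{x}\right)\left(-\delta x\right) = 0
% \delta x is arbitrary, so its coefficient must vanish
\ddot{x} = \frac{m_1 - m_2}{m_1 + m_2}\,g
```

The acceleration is obtained without ever solving for the tension, which is the practical benefit of discarding constraint forces that do no virtual work.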

d’Alembert’s Principle, by a stroke of genius, transforms the principle of virtual work from the realm of statics to dynamics. Application of virtual work to statics primarily leads to algebraic equations between the forces, whereas d’Alembert’s Principle applied to dynamics leads to differential equations.

6.3.2  Transformation  to  generalized  coordinates

In classical mechanical systems the coordinates $\delta\mathbf{r}_i$ usually are not independent due to the forces of constraint, and the constraint force energy contributes to equation 6.24. These problems can be eliminated by expressing d’Alembert’s Principle in terms of virtual displacements of $n$ independent generalized coordinates $q_i$ of the system for which the constraint force term $\sum_i^n \mathbf{f}_i^C\cdot\delta\mathbf{q}_i = 0$. Then the individual variational coefficients $\delta q_i$ are independent and $(\mathbf{F}_i^A - \dot{\mathbf{p}}_i)\cdot\delta\mathbf{q}_i = 0$ can be equated to zero for each value of $i$.

The transformation of the $N$-body system to $n$ independent generalized coordinates $q_j$ can be expressed as
$$\mathbf{r}_i = \mathbf{r}_i(q_1, q_2, q_3, \ldots, q_n, t) \tag{6.26}$$
Assuming $n$ independent coordinates, then the velocity can be written in terms of the generalized coordinates $q_j$ using the chain rule for partial differentiation.
$$\mathbf{v}_i \equiv \frac{d\mathbf{r}_i}{dt} = \sum_j^n \frac{\partial\mathbf{r}_i}{\partial q_j}\dot{q}_j + \frac{\partial\mathbf{r}_i}{\partial t} \tag{6.27}$$
The arbitrary virtual displacement $\delta\mathbf{r}_i$ can be related to the virtual displacement of the generalized coordinate $\delta q_j$ by
$$\delta\mathbf{r}_i = \sum_j^n \frac{\partial\mathbf{r}_i}{\partial q_j}\delta q_j \tag{6.28}$$
Note that by definition, a virtual displacement considers only displacements of the coordinates, and no time variation $\delta t$ is involved.

The above transformations can be used to express d’Alembert’s dynamical principle of virtual work in generalized coordinates. Thus the first term in d’Alembert’s Dynamical Principle (6.25) becomes
$$\sum_i^N \mathbf{F}_i^A\cdot\delta\mathbf{r}_i = \sum_i^N\sum_j^n \mathbf{F}_i^A\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}\delta q_j = \sum_j^n Q_j\,\delta q_j \tag{6.29}$$
where $Q_j$ are called components of the generalized force,¹ defined as
$$Q_j \equiv \sum_i^N \mathbf{F}_i^A\cdot\frac{\partial\mathbf{r}_i}{\partial q_j} \tag{6.30}$$
Note that just as the generalized coordinates $q_j$ need not have the dimensions of length, so the $Q_j$ do not necessarily have the dimensions of force, but the product $Q_j\,\delta q_j$ must have the dimensions of work. For example, $Q_j$ could be torque and $\delta q_j$ could be the corresponding infinitesimal rotation angle.
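The definition (6.30) can be evaluated directly for a simple case. The following sketch assumes a plane pendulum with the single generalized coordinate $\theta$ (measured from the downward vertical) and gravity as the only applied force; the function names and numerical values are hypothetical. It estimates $\partial\mathbf{r}/\partial\theta$ by a central finite difference and forms $Q_\theta = \mathbf{F}^A\cdot\partial\mathbf{r}/\partial\theta$, which should reproduce the gravitational torque $-mgl\sin\theta$.

```python
import math

def position(theta, length):
    """Cartesian position of the bob; theta is measured from the downward vertical."""
    return (length * math.sin(theta), -length * math.cos(theta))

def generalized_force(theta, force, length, h=1e-6):
    """Q_theta = F . dr/dtheta, with dr/dtheta from a central finite difference."""
    x_plus, y_plus = position(theta + h, length)
    x_minus, y_minus = position(theta - h, length)
    dr_dtheta = ((x_plus - x_minus) / (2.0 * h), (y_plus - y_minus) / (2.0 * h))
    return force[0] * dr_dtheta[0] + force[1] * dr_dtheta[1]

mass, g, length = 2.0, 9.81, 0.5
gravity = (0.0, -mass * g)  # applied force on the bob
theta = math.radians(30.0)
q_theta = generalized_force(theta, gravity, length)
print(q_theta)              # close to -m*g*l*sin(theta)
```

Here $Q_\theta$ has the dimensions of torque and $Q_\theta\,\delta\theta$ has the dimensions of work, in line with the remark above.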

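The second term in d’Alembert’s Principle (6.25) can be transformed using equation 6.28
$$\sum_i^N \dot{\mathbf{p}}_i\cdot\delta\mathbf{r}_i = \sum_i^N m_i\ddot{\mathbf{r}}_i\cdot\delta\mathbf{r}_i = \sum_j^n\left(\sum_i^N m_i\ddot{\mathbf{r}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}\right)\delta q_j \tag{6.31}$$
The right-hand side of (6.31) can be rewritten as
$$\sum_j^n\left(\sum_i^N m_i\ddot{\mathbf{r}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}\right)\delta q_j = \sum_j^n\sum_i^N\left\{\frac{d}{dt}\left(m_i\dot{\mathbf{r}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}\right) - m_i\dot{\mathbf{r}}_i\cdot\frac{d}{dt}\left(\frac{\partial\mathbf{r}_i}{\partial q_j}\right)\right\}\delta q_j \tag{6.32}$$
Note that equation (6.27) gives that
$$\frac{\partial\mathbf{v}_i}{\partial\dot{q}_j} = \frac{\partial\mathbf{r}_i}{\partial q_j} \tag{6.33}$$
therefore the first right-hand term in (6.32) can be written as
$$\frac{d}{dt}\left(m_i\dot{\mathbf{r}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}\right) = \frac{d}{dt}\left(m_i\mathbf{v}_i\cdot\frac{\partial\mathbf{v}_i}{\partial\dot{q}_j}\right) \tag{6.34}$$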

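1 This proof, plus the notation, conform with that used by Goldstein [Go50] and by other texts on classical mechanics.

The second right-hand term in (6.32) can be rewritten by interchanging the order of the differentiation with respect to $t$ and $q_j$
$$\frac{d}{dt}\left(\frac{\partial\mathbf{r}_i}{\partial q_j}\right) = \frac{\partial\mathbf{v}_i}{\partial q_j} \tag{6.35}$$
Substituting (6.34) and (6.35) into (6.32) gives
$$\sum_i^N \dot{\mathbf{p}}_i\cdot\delta\mathbf{r}_i = \sum_j^n\left(\sum_i^N m_i\ddot{\mathbf{r}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}\right)\delta q_j = \sum_j^n\sum_i^N\left\{\frac{d}{dt}\left(m_i\mathbf{v}_i\cdot\frac{\partial\mathbf{v}_i}{\partial\dot{q}_j}\right) - m_i\mathbf{v}_i\cdot\frac{\partial\mathbf{v}_i}{\partial q_j}\right\}\delta q_j \tag{6.36}$$
Inserting (6.29) and (6.36) into d’Alembert’s Principle (6.25) leads to the relation
$$\sum_j^n\left\{\frac{d}{dt}\left[\frac{\partial}{\partial\dot{q}_j}\left(\sum_i^N\frac{1}{2}m_iv_i^2\right)\right] - \frac{\partial}{\partial q_j}\left(\sum_i^N\frac{1}{2}m_iv_i^2\right) - Q_j\right\}\delta q_j = 0 \tag{6.37}$$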


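The $\sum_i\frac{1}{2}m_iv_i^2$ term can be identified with the system kinetic energy $T$. Thus d'Alembert's Principle reduces to the relation

$$\sum_j^n\left[\left\{\frac{d}{dt}\left(\frac{\partial T}{\partial\dot q_j}\right) - \frac{\partial T}{\partial q_j}\right\} - Q_j\right]\delta q_j = 0 \tag{6.38}$$

For cartesian coordinates $T$ is a function only of velocities $(\dot x, \dot y, \dot z)$ and thus the term $\frac{\partial T}{\partial q_j} = 0$. However, as discussed in appendix C.2.2, for curvilinear coordinates $\frac{\partial T}{\partial q_j} \neq 0$ due to the curvature of the coordinates, as is illustrated for polar coordinates where $\mathbf{v} = \dot r\,\hat{\mathbf{r}} + r\dot\theta\,\hat{\boldsymbol{\theta}}$.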
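As a brief worked sketch of this remark (an illustration added here, assuming a single particle of mass $m$ moving in a plane), the kinetic energy in polar coordinates depends explicitly on $r$, so its coordinate derivative does not vanish:

```latex
T = \tfrac{1}{2} m v^2 = \tfrac{1}{2} m\left(\dot r^2 + r^2\dot\theta^2\right),
\qquad
\frac{\partial T}{\partial r} = m r\dot\theta^2 \neq 0,
\qquad
\frac{\partial T}{\partial \theta} = 0 .
```

The nonzero $\partial T/\partial r$ is the centrifugal term that appears in the radial equation of motion.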

If all the $n$ generalized coordinates $q_j$ are independent, then equation 6.38 implies that the term in the square brackets is zero for each individual value of $j$. This leads to the basic Euler-Lagrange equations of motion for each of the independent generalized coordinates

$$\left\{\frac{d}{dt}\left(\frac{\partial T}{\partial\dot q_j}\right) - \frac{\partial T}{\partial q_j}\right\} = Q_j \tag{6.39}$$

where $n \geq j \geq 1$. That is, this leads to $n$ Euler-Lagrange equations of motion for the generalized forces $Q_j$. As discussed in chapter 5.8, when $m$ holonomic constraint forces apply, it is possible to reduce the system to $s = n - m$ independent generalized coordinates for which equation 6.25 applies.

In 1687 Leibniz proposed minimizing the time integral of his "vis viva", which equals $2T$. That is,

$$\delta\int_{t_1}^{t_2} T\,dt = 0 \tag{6.40}$$


The  variational  equation  6.39  accomplishes  the  minimization  of  equation  6.40.  It  is  remarkable  that  Leibniz 
anticipated  the  basic  variational  concept  prior  to  the  birth  of  the  developers  of  Lagrangian  mechanics,  i.e., 
d’Alembert,  Euler,  Lagrange,  and  Hamilton. 


6.3.3  Lagrangian 


The handling of both conservative and non-conservative generalized forces $Q_j$ is best achieved by assuming that the generalized force $Q_j = \sum_i \mathbf{F}_i^A\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}$ can be partitioned into a conservative velocity-independent term, that can be expressed in terms of the gradient of a scalar potential, $-\nabla U_j$, plus an excluded generalized force $Q_j^{EX}$ which contains the non-conservative, velocity-dependent, and all the constraint forces not explicitly included in the potential $U_j$. That is,

$$Q_j = -\nabla U_j + Q_j^{EX} \tag{6.41}$$

Inserting (6.41) into (6.38), and assuming that the potential $U$ is velocity independent, allows (6.38) to be rewritten as

$$\sum_j^n\left[\left\{\frac{d}{dt}\left(\frac{\partial (T-U)}{\partial\dot q_j}\right) - \frac{\partial (T-U)}{\partial q_j}\right\} - Q_j^{EX}\right]\delta q_j = 0 \tag{6.42}$$

The standard definition of the Lagrangian is

$$L \equiv T - U \tag{6.43}$$

then (6.42) can be written as

$$\sum_j^n\left[\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} - Q_j^{EX}\right]\delta q_j = 0 \tag{6.44}$$


Note  that  equation  (6.44)  contains  the  basic  Euler-Lagrange  equation  (6.38)  as  a special  case  when  U = 0. 
In  addition,  note  that  if  all  the  generalized  coordinates  are  independent,  then  the  square  bracket  terms  are 
zero  for  each  value  of  j,  which  leads  to  the  general  Euler-Lagrange  equations  of  motion 


$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} = Q_j^{EX} \tag{6.45}$$

where $n \geq j \geq 1$.

Chapter 6.5.3 will show that the holonomic constraint forces can be factored out of the generalized force term $Q_j^{EX}$, which simplifies derivation of the equations of motion using Lagrangian mechanics. The general Euler-Lagrange equations of motion are used extensively in classical mechanics because conservative forces play a ubiquitous role in classical mechanics.
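A minimal worked sketch of how the excluded generalized force enters equation (6.45), assuming a single particle moving in one dimension with coordinate $x$, a velocity-independent potential $U(x)$, and a linear drag force whose coefficient $b$ is a hypothetical parameter introduced only for this illustration:

```latex
L = \tfrac{1}{2} m\dot x^2 - U(x), \qquad Q_x^{EX} = -b\,\dot x

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot x}\right) - \frac{\partial L}{\partial x}
  = m\ddot x + \frac{dU}{dx} = -b\,\dot x
```

The conservative force $-dU/dx$ arrives through the Lagrangian, while the dissipative drag has to be supplied separately as $Q_x^{EX}$.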


6.4  Lagrange  equations  from  Hamilton’s  Principle 

In two papers published in 1834 and 1835, Hamilton announced a dynamical principle upon which it is possible to base all of classical mechanics, and much of classical physics. Hamilton was seeking a theory of optics when he developed Hamilton's Principle and the field of Hamiltonian mechanics, both of which play a crucial role in classical mechanics and modern physics. Hamilton's Principle states: "dynamical systems follow paths that minimize the time integral of the Lagrangian". That is, the action functional $S$

$$S = \int_{t_1}^{t_2} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt \tag{6.46}$$

has a minimum value for the correct path of motion. As discussed in chapter 13.2, the choice of the Lagrangian usually is limited to a function of the generalized coordinates $\mathbf{q}$, and their velocities $\dot{\mathbf{q}}$, plus time $t$. At this stage the discussion is restricted to use of the standard Lagrangian $L = T - U$. Hamilton's Principle can be written in terms of the virtual infinitesimal displacement $\delta$ as

$$\delta S = \delta\int_{t_1}^{t_2} L\,dt = 0 \tag{6.47}$$

Variational calculus therefore implies that a system of $s$ independent generalized coordinates must satisfy the basic Lagrange-Euler equations

$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j} = 0 \tag{6.48}$$

This is precisely the conclusion given in equation 6.45 when $Q_j^{EX} = 0$, which was derived using d'Alembert's Principle.

This discussion has demonstrated that Euler's variational differential equation underlies both the differential variational d'Alembert Principle, and the integral Hamilton's Principle. These approaches have been used to derive the most general Lagrange equations that are applicable to both holonomic and non-holonomic constraints, as well as for conservative and non-conservative systems. Chapter 6.2 presented a plausibility argument that illustrated that the same result is justified based on Newtonian mechanics. However, d'Alembert's Principle and Hamilton's Principle, expressed in terms of generalized coordinates, are broader in scope than the equations of motion implied using Newtonian mechanics.
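The statement of Hamilton's Principle can be checked numerically. The following is a minimal sketch, assuming a particle in uniform gravity with $L = \frac{1}{2}m\dot y^2 - mgy$; the function names, the step count, and the sinusoidal trial variation are all illustrative choices rather than anything prescribed by the text. It compares a discretized action for the true path with the action for neighbouring paths that share the same end points.

```python
import math

def discrete_action(path, dt, m=1.0, g=9.81):
    """Approximate S = integral of (T - U) dt for L = m*v**2/2 - m*g*y."""
    action = 0.0
    for y0, y1 in zip(path, path[1:]):
        velocity = (y1 - y0) / dt      # finite-difference velocity on this step
        height = 0.5 * (y0 + y1)       # midpoint height on this step
        action += (0.5 * m * velocity**2 - m * g * height) * dt
    return action

def sample_paths(eps, n_steps=200, t_total=1.0, v0=3.0, g=9.81):
    """Return (dt, path): free fall plus a variation that vanishes at both ends."""
    dt = t_total / n_steps
    path = []
    for i in range(n_steps + 1):
        t = i * dt
        true_y = v0 * t - 0.5 * g * t**2               # solution of y'' = -g
        bump = eps * math.sin(math.pi * t / t_total)   # zero at t = 0 and t = t_total
        path.append(true_y + bump)
    return dt, path

dt, true_path = sample_paths(eps=0.0)
s_true = discrete_action(true_path, dt)
for eps in (-0.2, -0.05, 0.05, 0.2):
    _, varied = sample_paths(eps)
    difference = discrete_action(varied, dt) - s_true
    print(f"eps={eps:+.2f}  S - S_true = {difference:.6f}")
```

For this Lagrangian the difference grows as the square of `eps` and is positive for either sign, which is the numerical signature of a stationary (here minimal) action on the true path.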
6.5  Constrained  systems

The  motion  for  systems  subject  to  constraints  is  difficult  to  calculate  using  Newtonian  mechanics  because 
all  the  unknown  constraint  forces  must  be  included  explicitly  with  the  active  forces  in  order  to  determine 
the  equations  of  motion.  Lagrangian  mechanics  avoids  these  difficulties  by  allowing  selection  of  independent 
generalized  coordinates  that  incorporate  the  correlated  motion  induced  by  the  constraint  forces.  This  allows 
the  constraint  forces  acting  on  the  system  to  be  ignored  by  reducing  the  system  to  a minimal  set  of  generalized 
coordinates.  The  holonomic  constraint  forces  can  be  determined  using  the  Lagrange  multiplier  approach, 
and  all  constraint  forces  can  be  determined  by  including  them  as  generalized  forces,  as  described  below. 

6.5.1  Choice  of  generalized  coordinates 

As discussed in chapter 5.8, the flexibility and freedom for selection of generalized coordinates is a considerable advantage of Lagrangian mechanics when handling constrained systems. The generalized coordinates can be any set of independent variables that completely specify the scalar action functional, equation 6.46. The generalized coordinates are not required to be orthogonal as is required using the vectorial Newtonian approach. The secret to using generalized coordinates is to select coordinates that are perpendicular to the constraint forces so that the constraint forces do no work. Moreover, if the constraints are rigid, then the constraint forces do no work in the direction of the constraint force. As a consequence, the constraint forces do not contribute to the action integral and thus the $\sum_i \mathbf{f}_i^C\cdot\delta\mathbf{r}_i$ term in equation 6.19 can be omitted from the action integral. Generalized coordinates allow reducing the number of unknowns from $n$ to $s = n - m$ when the system has $m$ holonomic constraints. In addition, generalized coordinates facilitate using both the Lagrange multipliers, and the generalized forces, approaches for determining the constraint forces.


6.5.2  Minimal  set  of  generalized  coordinates 

The set of $n$ generalized coordinates $q_j$ is used to describe the motion of the system. No restrictions have been placed on the nature of the constraints other than that they are workless for a virtual displacement. If the $m$ constraints are holonomic, then it is possible to find sets of $s = n - m$ independent generalized coordinates $q_j$ that contain the $m$ constraint conditions implicitly in the transformation equations

$$\mathbf{r}_i = \mathbf{r}_i(q_1, q_2, q_3, \ldots, q_s, t) \tag{6.49}$$

For the case of $s = n - m$ unknowns, any virtual displacement $\delta q_j$ is independent of $\delta q_k$, therefore the only way for (6.44) to hold is for the term in brackets to vanish for each value of $j$, that is

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} = Q_j^{EX} \tag{6.50}$$

where $j = 1, 2, 3, \ldots, s$. These are the Lagrange equations for the minimal set of $s$ independent generalized coordinates.

If all the generalized forces are conservative plus velocity independent, and are included in the potential $U$, and $Q_j^{EX} = 0$, then (6.50) simplifies to

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} = 0 \tag{6.51}$$

This is Euler's differential equation, derived earlier using the calculus of variations. Thus d'Alembert's Principle leads to a solution that minimizes the action integral, $\delta\int_{t_1}^{t_2} L\,dt = 0$, as stated by Hamilton's Principle.


6.5.3  Lagrange  multipliers  approach 

Equation (6.44) sums over all $n$ coordinates for $N$ particles, providing $n$ equations of motion. If the $m$ constraints are holonomic they can be expressed by $m$ algebraic equations of constraint

$$g_k(q_1, q_2, \ldots, q_n, t) = 0 \tag{6.52}$$

where $k = 1, 2, 3, \ldots, m$. Kinematic constraints can be expressed in terms of the infinitesimal displacements of the form

$$\sum_{j=1}^{n}\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\,dq_j + \frac{\partial g_k}{\partial t}\,dt = 0 \tag{6.53}$$

where $k = 1, 2, 3, \ldots, m$, $j = 1, 2, 3, \ldots, n$, and where the $\frac{\partial g_k}{\partial q_j}$ and $\frac{\partial g_k}{\partial t}$ are functions of the generalized coordinates $q_j$, described by the vector $\mathbf{q}$, that are derived from the equations of constraint. As discussed in chapter 5.7, if (6.53) represents the total differential of a function, then it can be integrated to give a holonomic relation of the form of equation (6.52). However, if (6.53) is not the total differential, then it can be integrated only after having solved the full problem. If $\frac{\partial g_k}{\partial t} = 0$ then the constraint is scleronomic.

The discussion of Lagrange multipliers in chapter 5.9.1 showed that, for virtual displacements $\delta q_j$, the correlation of the generalized coordinates, due to the constraint forces, can be taken into account by multiplying (6.53) by unknown Lagrange multipliers $\lambda_k$ and summing over all $m$ constraints. Generalized forces can be partitioned into a Lagrange multiplier term plus a remainder force. That is

$$Q_j^{EX} = \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{6.54}$$

since by definition $\delta t = 0$ for virtual displacements.

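Chapter 5.9.1 showed that holonomic forces of constraint can be taken into account by introducing the Lagrange undetermined multipliers approach, which is equivalent to defining an extended Lagrangian $L'(\mathbf{q}, \dot{\mathbf{q}}, \boldsymbol{\lambda}, t)$ where

$$L'(\mathbf{q}, \dot{\mathbf{q}}, \boldsymbol{\lambda}, t) = L(\mathbf{q}, \dot{\mathbf{q}}, t) + \sum_{k=1}^{m}\sum_{j=1}^{n}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) \tag{6.55}$$

Finding the extremum for the extended Lagrangian $L'(\mathbf{q}, \dot{\mathbf{q}}, \boldsymbol{\lambda}, t)$ using (6.47) gives

$$\sum_j^n\left[\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} - \sum_k^m\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) - Q_j^{EXC}\right]\delta q_j = 0 \tag{6.56}$$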


where $Q_j^{EXC}$ is the remaining part of the generalized force $Q_j$ after subtracting both the part of the force absorbed in the potential energy $U$, which is buried in the Lagrangian $L$, as well as the holonomic constraint forces which are included in the Lagrange multiplier terms $\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)$. The $m$ Lagrange multipliers $\lambda_k$ can be chosen arbitrarily in (6.56). Utilizing the free choice of the $m$ Lagrange multipliers $\lambda_k$ allows them to be determined in such a way that the coefficients of the first $m$ infinitesimals, i.e. the square brackets, vanish. Therefore the expression in the square bracket must vanish for each value of $1 \leq j \leq m$. Thus it follows that

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} - \sum_k^m\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) - Q_j^{EXC} = 0 \tag{6.57}$$


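when $j = 1, 2, \ldots, m$. Thus (6.56) reduces to a sum over the remaining coordinates between $m + 1 \leq j \leq n$

$$\sum_{j=m+1}^{n}\left[\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} - \sum_k^m\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) - Q_j^{EXC}\right]\delta q_j = 0 \tag{6.58}$$

In equation (6.58) the $s = n - m$ infinitesimals $\delta q_j$ can be chosen freely since the $s = n - m$ degrees of freedom are independent. Therefore the expression in the square bracket must vanish for each value of $m + 1 \leq j \leq n$. Thus it follows that

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} - \sum_k^m\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) - Q_j^{EXC} = 0 \tag{6.59}$$

where $j = m + 1, m + 2, \ldots, n$. Combining equations (6.57) and (6.59) then gives the important general relation that for $1 \leq j \leq n$

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \sum_k^m\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC} \tag{6.60}$$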
To summarize, the Lagrange multiplier approach (6.60) automatically solves the $n$ equations plus the $m$ holonomic equations of constraint, which determines the $n + m$ unknowns, that is, the $n$ coordinates plus the $m$ forces of constraint. The beauty of the Lagrange multipliers is that all $n$ variables, plus the $m$ constraint forces, are found simultaneously by using the calculus of variations to determine the extremum for the expanded Lagrangian $L'(\mathbf{q}, \dot{\mathbf{q}}, \boldsymbol{\lambda}, t)$.
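As a hedged illustration of equation (6.60), consider a plane pendulum, an example chosen here rather than taken from this section: a mass $m$ on a massless rod of length $l$, described by polar coordinates $(r, \theta)$ with $\theta$ measured from the downward vertical, subject to the single holonomic constraint $g_1 = r - l = 0$ and with $Q_j^{EXC} = 0$.

```latex
L = \tfrac{1}{2} m\left(\dot r^2 + r^2\dot\theta^2\right) + m g r\cos\theta ,
\qquad g_1 = r - l = 0

r:\quad m\ddot r - m r\dot\theta^2 - m g\cos\theta
      = \lambda_1\frac{\partial g_1}{\partial r} = \lambda_1

\theta:\quad \frac{d}{dt}\left(m r^2\dot\theta\right) + m g r\sin\theta
      = \lambda_1\frac{\partial g_1}{\partial \theta} = 0

r = l:\quad \lambda_1 = -\left(m l\dot\theta^2 + m g\cos\theta\right),
\qquad m l^2\ddot\theta + m g l\sin\theta = 0
```

The two Lagrange equations plus the constraint determine $r$, $\theta$ and $\lambda_1$ together; the multiplier is the radial constraint force, and its negative sign indicates that the rod pulls the mass toward the pivot.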


6.5.4  Generalized  forces  approach 

The two right-hand terms in (6.60) can be understood to be those forces acting on the system that are not absorbed into the scalar potential $U$ component of the Lagrangian $L$. The Lagrange multiplier terms $\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)$ account for the holonomic forces of constraint that are not included in the conservative potential or in the generalized forces $Q_j^{EXC}$. The generalized force

$$Q_j^{EXC} = \sum_i \mathbf{F}_i^A\cdot\frac{\partial\mathbf{r}_i}{\partial q_j} \tag{6.17}$$

is the sum of the components in the $q_j$ direction for all external forces that have not been taken into account by the scalar potential or the Lagrange multipliers. Thus the non-conservative generalized force $Q_j^{EXC}$ contains non-holonomic constraint forces, including dissipative forces such as drag or friction, that are not included in $U$, or used in the Lagrange multiplier terms to account for the holonomic constraint forces.

The concept of generalized forces is illustrated by the case of spherical coordinate systems. The attached table gives the displacement elements $\delta q_i$ (taken from table C4) and the generalized force for the three coordinates. Note that $Q_r$ has the dimensions of force and $Q_r\delta r$ has the units of energy. By contrast equation 6.30 gives that $Q_\theta = F_\theta r$ and $Q_\phi = F_\phi r\sin\theta$, which have the dimensions of torque. However, $Q_\theta\delta\theta$ and $Q_\phi\delta\phi$ both have the dimensions of energy as is required in equation 6.30. This illustrates that the units used for generalized forces depend on the units of the corresponding generalized coordinate.


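Unit vectors                  $\delta\mathbf{q}_i$                              $\mathbf{Q}_i$                                  $\mathbf{Q}_i\cdot\delta\mathbf{q}_i$
$\hat{\mathbf{r}}$            $\hat{\mathbf{r}}\,dr$                            $\hat{\mathbf{r}}\,F_r$                         $F_r\,dr$
$\hat{\boldsymbol{\theta}}$   $\hat{\boldsymbol{\theta}}\,r\,d\theta$           $\hat{\boldsymbol{\theta}}\,F_\theta r$         $F_\theta r\,d\theta$
$\hat{\boldsymbol{\phi}}$     $\hat{\boldsymbol{\phi}}\,r\sin\theta\,d\phi$     $\hat{\boldsymbol{\phi}}\,F_\phi r\sin\theta$   $F_\phi r\sin\theta\,d\phi$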
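The spherical-coordinate entries can be recovered directly from the definition (6.30). The short derivation below is a sketch that assumes a single particle at position $\mathbf{r} = r\hat{\mathbf{r}}$ acted on by a force $\mathbf{F} = F_r\hat{\mathbf{r}} + F_\theta\hat{\boldsymbol{\theta}} + F_\phi\hat{\boldsymbol{\phi}}$:

```latex
\frac{\partial\mathbf{r}}{\partial r} = \hat{\mathbf{r}}, \qquad
\frac{\partial\mathbf{r}}{\partial\theta} = r\,\hat{\boldsymbol{\theta}}, \qquad
\frac{\partial\mathbf{r}}{\partial\phi} = r\sin\theta\,\hat{\boldsymbol{\phi}}

Q_r = \mathbf{F}\cdot\frac{\partial\mathbf{r}}{\partial r} = F_r, \qquad
Q_\theta = \mathbf{F}\cdot\frac{\partial\mathbf{r}}{\partial\theta} = F_\theta\, r, \qquad
Q_\phi = \mathbf{F}\cdot\frac{\partial\mathbf{r}}{\partial\phi} = F_\phi\, r\sin\theta
```

The factors $r$ and $r\sin\theta$ are the lever arms that turn the angular force components into torques.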

6.6  Applying  the  Euler-Lagrange  equations  to  classical  mechanics 


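d'Alembert's principle of virtual work has been used to derive the Euler-Lagrange equations, which also satisfy Hamilton's Principle, and the Newtonian plausibility argument. These imply that the actual path taken in configuration space $(q_j, \dot q_j, t)$ is the one that minimizes the action integral $\int_{t_1}^{t_2} L(q_j, \dot q_j; t)\,dt$. As a consequence, the Euler equations for the calculus of variations lead to the Lagrange equations of motion

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \sum_k^m\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC} \tag{6.60}$$

for $n$ variables, with $m$ equations of constraint. The generalized forces $Q_j^{EXC}$ are those not included in the conservative potential energy $U$, or in the Lagrange multipliers approach for holonomic equations of constraint.² The following is a logical procedure for applying the Euler-Lagrange equations to classical mechanics.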
The  following  is  a logical  procedure  for  applying  the  Euler-Lagrange  equations  to  classical  mechanics. 


1)  Select  a set  of  independent  generalized  coordinates: 

Select an optimum set of independent generalized coordinates as described in chapter 6.5.1. Use of generalized coordinates is always advantageous since they incorporate the constraints, and can reduce the number of unknowns, both of which simplify use of Lagrangian mechanics.

² Euler's differential equation is ubiquitous in Lagrangian mechanics. Thus, for brevity, it is convenient to define the concept of the Lagrange linear operator $\Lambda_j$, as described in appendix F.2,

$$\Lambda_j \equiv \frac{d}{dt}\frac{\partial}{\partial\dot q_j} - \frac{\partial}{\partial q_j}$$

where $\Lambda_j$ operates on the Lagrangian $L$. Then Euler's equations can be written compactly in the form $\Lambda_j L = 0$.

2)  Partition  the  active  forces:

The  active  forces  should  be  partitioned  into  the  following  three  groups: 

(i) Conservative one-body forces plus the velocity-dependent electromagnetic force, which can be characterized by the scalar potential $U$ that is absorbed into the Lagrangian. The gravitational forces plus the velocity-dependent electromagnetic force can be absorbed into the potential $U$ as discussed in chapter 6.10. This approach is by far the easiest way to account for such forces in Lagrangian mechanics.

(ii) Holonomic constraint forces provide algebraic relations that couple some of the generalized coordinates. This coupling can be used either to reduce the number of generalized coordinates used, or to determine these holonomic constraint forces using the Lagrange multiplier approach.

(iii) Generalized forces provide a mechanism for introducing non-conservative and non-holonomic constraint forces into Lagrangian mechanics. Typically generalized forces are used to introduce dissipative forces.

Typical  systems  can  involve  a mixture  of  all  three  categories  of  active  forces.  For  example,  mechanical 
systems  often  include  gravity,  introduced  as  a potential,  holonomic  constraint  forces  are  determined  using 
Lagrange  multipliers,  and  dissipative  forces  are  included  as  generalized  forces. 

3)  Minimal  set  of  generalized  coordinates: 

The  ability  to  embed  constraint  forces  directly  into  the  generalized  coordinates  is  a tremendous  advantage 
enjoyed  by  the  Lagrangian  and  Hamiltonian  variational  approaches  to  classical  mechanics.  If  the  constraint 
forces  are  not  required,  then  choice  of  a minimal  set  of  generalized  coordinates  significantly  reduces  the 
number of equations of motion that need to be solved.

4)  Derive  the  Lagrangian: 

The Lagrangian is derived in terms of the generalized coordinates, with the conservative forces absorbed into the scalar potential $U$.

5)  Derive  the  equations  of  motion:

Equation (6.60) is solved to determine the $n$ generalized coordinates, plus the $m$ Lagrange multipliers characterizing the holonomic constraint forces, plus any generalized forces that were included. The holonomic constraint forces then are given by evaluating the $\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)$ terms for the $m$ holonomic forces.

In summary, Lagrangian mechanics is based on energies, which are scalars, in contrast to Newtonian mechanics, which is based on vector forces and momentum. As a consequence, Lagrangian mechanics allows use of any set of independent generalized coordinates, which do not have to be orthogonal, and they can have very different units for different variables. The generalized coordinates can incorporate the correlations introduced by constraint forces.

The active forces are split into the following three categories:

1. Velocity-independent conservative forces are taken into account using scalar potentials $U_j$.

2. Holonomic constraint forces can be determined using Lagrange multipliers.

3. Non-holonomic constraints require use of generalized forces $Q_j^{EXC}$.

Use of the concept of scalar potentials is a trivial and powerful way to incorporate conservative forces in Lagrangian mechanics. The Lagrange multipliers approach requires using the Euler-Lagrange equations for $n + m$ coordinates but determines the holonomic constraint forces and the equations of motion simultaneously. Non-holonomic constraints and dissipative forces can be incorporated into Lagrangian mechanics via use of generalized forces, which broadens the scope of Lagrangian mechanics.

Note  that  the  equations  of  motion  resulting  from  the  Lagrange-Euler  algebraic  approach  are  the  same 
equations  of  motion  as  obtained  using  Newtonian  mechanics.  However,  the  Lagrangian  is  a scalar  which 
facilitates  rotation  into  the  most  convenient  frame  of  reference,  and  can  greatly  simplify  determination  of 
the  equations  of  motion  when  constraint  forces  apply.  As  discussed  in  chapter  14,  the  Lagrangian  and  the 
Hamiltonian  variational  approaches  to  mechanics  are  the  only  viable  way  to  handle  relativistic,  statistical, 
and  quantum  mechanics. 
6.7  Applications  to  unconstrained  systems

Although most dynamical systems involve constrained motion, it is useful to consider examples of systems subject to conservative forces with no constraints. For no constraints the Lagrange-Euler equations (6.60) simplify to $\Lambda_j L = 0$ where $j = 1, 2, \ldots, n$, and the transformation to generalized coordinates is of no consequence.


6.1  Example:  Motion  of  a free  particle,  U=0 

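The Lagrangian in cartesian coordinates is $L = \frac{1}{2}m(\dot x^2 + \dot y^2 + \dot z^2)$. Then

$$\frac{\partial L}{\partial\dot x} = m\dot x, \qquad \frac{\partial L}{\partial\dot y} = m\dot y, \qquad \frac{\partial L}{\partial\dot z} = m\dot z$$

$$\frac{\partial L}{\partial x} = \frac{\partial L}{\partial y} = \frac{\partial L}{\partial z} = 0$$

Inserting these in the Lagrange equation gives

$$\Lambda_x L = \frac{d}{dt}\frac{\partial L}{\partial\dot x} - \frac{\partial L}{\partial x} = \frac{d}{dt}m\dot x - 0 = 0$$

Thus

$$p_x = m\dot x = \text{constant}$$
$$p_y = m\dot y = \text{constant}$$
$$p_z = m\dot z = \text{constant}$$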


That  is,  this  shows  that  the  linear  momentum  is  conserved  if  U is  a constant,  that  is,  no  forces  apply.  Note 
that  momentum  conservation  has  been  derived  without  any  direct  reference  to  forces. 


6.2  Example:  Motion  in  a uniform  gravitational  field 


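Consider motion in the $x$-$y$ plane. The kinetic energy is $T = \frac{1}{2}m\left(\dot x^2 + \dot y^2\right)$ while the potential energy is $U = mgy$ where $U(y = 0) = 0$. Thus

$$L = \frac{1}{2}m\left(\dot x^2 + \dot y^2\right) - mgy$$

Using the Lagrange equation for the $x$ coordinate gives

$$\Lambda_x L = \frac{d}{dt}\frac{\partial L}{\partial\dot x} - \frac{\partial L}{\partial x} = \frac{d}{dt}m\dot x - 0 = 0$$

Thus the horizontal momentum $m\dot x$ is conserved and $\ddot x = 0$. The $y$ coordinate gives

$$\Lambda_y L = \frac{d}{dt}\frac{\partial L}{\partial\dot y} - \frac{\partial L}{\partial y} = \frac{d}{dt}m\dot y + mg = 0$$

Thus the Lagrangian produces the same result as derived using Newton's Laws of Motion,

$$\ddot y = -g$$

[Figure: Motion in a gravitational field]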
The importance of selecting the most convenient generalized coordinates is nicely illustrated by trying to solve this problem using polar coordinates $r, \theta$, where $r$ is radial distance and $\theta$ the elevation angle from the $x$ axis as shown in the adjacent figure. Then

$$T = \frac{1}{2}m\dot r^2 + \frac{1}{2}m\left(r\dot\theta\right)^2$$

$$U = mgr\sin\theta$$

Thus

$$L = \frac{1}{2}m\dot r^2 + \frac{1}{2}m\left(r\dot\theta\right)^2 - mgr\sin\theta$$

$\Lambda_r L = 0$ for the $r$ coordinate gives

$$r\dot\theta^2 - g\sin\theta - \ddot r = 0$$

$\Lambda_\theta L = 0$ for the $\theta$ coordinate gives

$$-gr\cos\theta - 2r\dot r\dot\theta - r^2\ddot\theta = 0$$

These equations written in polar coordinates are more complicated than the result expressed in cartesian coordinates. This is because the potential energy depends directly on the $y$ coordinate, whereas it is a function of both $r$ and $\theta$. This illustrates both the freedom for using different generalized coordinates, plus the importance of choosing a sensible set of generalized coordinates.
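That the more complicated polar equations describe the same motion can be checked numerically. The sketch below is an illustration under stated assumptions: it integrates the two polar equations of motion with a hand-written fourth-order Runge-Kutta step and compares the end point with the cartesian solution $x = x_0 + v_{x0}t$, $y = y_0 + v_{y0}t - \frac{1}{2}gt^2$. The function names, the initial conditions and the value of `G` are hypothetical choices, and the launch point is kept away from the origin where the polar equations are singular.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def polar_derivatives(state):
    """Right-hand side of the polar equations; state = (r, theta, r_dot, theta_dot)."""
    r, theta, r_dot, theta_dot = state
    r_ddot = r * theta_dot**2 - G * math.sin(theta)
    theta_ddot = -(G * math.cos(theta) + 2.0 * r_dot * theta_dot) / r
    return (r_dot, theta_dot, r_ddot, theta_ddot)

def rk4_step(state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = polar_derivatives(state)
    k2 = polar_derivatives(shifted(state, k1, 0.5 * dt))
    k3 = polar_derivatives(shifted(state, k2, 0.5 * dt))
    k4 = polar_derivatives(shifted(state, k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate_polar(x0, y0, vx0, vy0, t_end, n_steps=1000):
    """Integrate in polar coordinates and return the final cartesian position."""
    r = math.hypot(x0, y0)
    theta = math.atan2(y0, x0)
    r_dot = (x0 * vx0 + y0 * vy0) / r          # radial velocity component
    theta_dot = (x0 * vy0 - y0 * vx0) / r**2   # angular velocity about the origin
    state = (r, theta, r_dot, theta_dot)
    dt = t_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(state, dt)
    r, theta = state[0], state[1]
    return r * math.cos(theta), r * math.sin(theta)

x0, y0, vx0, vy0, t_end = 1.0, 1.0, 1.0, 2.0, 0.5
x_polar, y_polar = integrate_polar(x0, y0, vx0, vy0, t_end)
x_exact = x0 + vx0 * t_end
y_exact = y0 + vy0 * t_end - 0.5 * G * t_end**2
print(abs(x_polar - x_exact), abs(y_polar - y_exact))
```

Both printed differences should be at the level of the integration error, which supports the point that the choice of coordinates changes the algebra but not the physics.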

6.3  Example:  Central  forces 

Consider a mass $m$ moving under the influence of a spherically-symmetric, conservative, attractive, inverse-square force. The potential then is

$$U = -\frac{k}{r}$$

It is natural to express the Lagrangian in spherical coordinates for this system. That is,

$$L = \frac{1}{2}m\dot r^2 + \frac{1}{2}m\left(r\dot\theta\right)^2 + \frac{1}{2}m\left(r\sin\theta\,\dot\phi\right)^2 + \frac{k}{r}$$

$\Lambda_r L = 0$ for the $r$ coordinate gives

$$m\ddot r - mr\left[\dot\theta^2 + \sin^2\theta\,\dot\phi^2\right] = -\frac{k}{r^2}$$

where the $mr\sin^2\theta\,\dot\phi^2$ term comes from the centripetal acceleration.

$\Lambda_\phi L = 0$ for the $\phi$ coordinate gives

$$\frac{d}{dt}\left(mr^2\sin^2\theta\,\dot\phi\right) = 0$$

This implies that the derivative of the angular momentum about the $\phi$ axis, $\dot p_\phi = 0$, and thus $p_\phi = mr^2\sin^2\theta\,\dot\phi$ is a constant of motion.

$\Lambda_\theta L = 0$ for the $\theta$ coordinate gives

$$\frac{d}{dt}\left(mr^2\dot\theta\right) - mr^2\sin\theta\cos\theta\,\dot\phi^2 = 0$$

That is,

$$\dot p_\theta = mr^2\sin\theta\cos\theta\,\dot\phi^2 = \frac{p_\phi^2\cos\theta}{mr^2\sin^3\theta}$$

Note that $p_\theta$ is a constant of motion if $p_\phi = 0$ and only the radial coordinate is influenced by the radial form of the central potential.
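The step from the $\theta$ equation to its form in terms of the conserved momentum can be made explicit. The following short derivation is a sketch that uses only the spherical-coordinate Lagrangian of this example, with $p_\phi = mr^2\sin^2\theta\,\dot\phi$ and $p_\theta = mr^2\dot\theta$:

```latex
\dot\phi = \frac{p_\phi}{m r^2\sin^2\theta}
\quad\Longrightarrow\quad
\dot p_\theta = m r^2\sin\theta\cos\theta\,\dot\phi^2
  = m r^2\sin\theta\cos\theta\,\frac{p_\phi^2}{m^2 r^4\sin^4\theta}
  = \frac{p_\phi^2\cos\theta}{m r^2\sin^3\theta}
```

Setting $p_\phi = 0$ makes the right-hand side vanish, which is why $p_\theta$ is then conserved.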
6.8  Applications  to  systems  involving  holonomic  constraints

The  equations  of  motion  that  result  from  the  Lagrange-Euler  algebraic  approach  are  the  same  as  those  given 
by  Newtonian  mechanics.  The  solution  of  these  equations  of  motion  can  be  obtained  mathematically  using 
the  chosen  initial  conditions.  The  following  simple  example  of  a disk  rolling  on  an  inclined  plane,  is  useful 
for  comparing  the  merits  of  the  Newtonian  method  with  Lagrange  mechanics  employing  either  minimal 
generalized  coordinates,  the  Lagrange  multipliers,  or  the  generalized  forces  approaches. 

6.4  Example:  Disk  rolling  on  an  inclined  plane 

Consider a disk rolling down an inclined plane to compare the results obtained using Newton's laws with the results obtained using Lagrange's equations with either generalized coordinates, Lagrange multipliers, or generalized forces. All these cases assume that the friction is sufficient to ensure that the rolling equation of constraint applies and that the disk has a radius $R$ and moment of inertia of $I$. Assume as generalized coordinates the distance along the inclined plane $y$, which is perpendicular to the normal constraint force $N$, and the distance perpendicular to the inclined plane $x$, plus the rolling angle $\theta$. The constraint for rolling is holonomic

$$y - R\theta = 0$$

The frictional force is $F_f$. The constraint that it rolls along the plane implies

$$x - R = 0$$

a) Newton's laws of motion

Newton's law for the components of the forces along the inclined plane gives

$$mg\sin\alpha - F_f = m\ddot y \tag{a}$$

Perpendicular to the inclined plane, Newton's law gives

$$mg\cos\alpha = N$$

The torque on the disk gives

$$F_f R = I\ddot\theta \tag{b}$$

Assuming the disc rolls gives

$$\ddot y = R\ddot\theta$$

then

$$F_f = \frac{I}{R^2}\ddot y \tag{c}$$

Inserting this in (a) gives

$$\left(m + \frac{I}{R^2}\right)\ddot y - mg\sin\alpha = 0$$

The moment of inertia of a uniform solid circular disk is

$$I = \frac{1}{2}mR^2$$

Therefore

$$\ddot y = \frac{2}{3}g\sin\alpha$$

and the frictional force is

$$F_f = \frac{mg}{3}\sin\alpha$$

which is smaller than the gravitational force along the plane, which is $mg\sin\alpha$.
(c) 
b) Lagrange equations with a minimal set of generalized coordinates

Using the generalized coordinates defined above, the total kinetic energy is

$$T = \frac{1}{2}m\dot y^2 + \frac{1}{2}I\dot\theta^2$$

The conservative gravitational force can be absorbed into the potential energy

$$U = mg(l - y)\sin\alpha$$

Thus the Lagrangian is

$$L = \frac{1}{2}m\dot y^2 + \frac{1}{2}I\dot\theta^2 - mg(l - y)\sin\alpha$$

The holonomic equations of constraint are

$$g_1 = y - R\theta = 0$$
$$g_2 = x - R = 0$$

A holonomic constraint can be used to reduce the system to a single generalized coordinate $y$ plus generalized velocity $\dot y$. Expressed in terms of this single generalized coordinate, the Lagrangian becomes

$$L = \frac{1}{2}\left(m + \frac{I}{R^2}\right)\dot y^2 - mg(l - y)\sin\alpha$$

The Lagrange equation $\Lambda_y L = 0$ gives

$$mg\sin\alpha = \left(m + \frac{I}{R^2}\right)\ddot y$$

Again if $I = \frac{1}{2}mR^2$ then

$$\ddot y = \frac{2}{3}g\sin\alpha$$

The solution for the $x$ coordinate is trivial. This answer is identical to that obtained using Newton's laws of motion. Note that no forces have been determined using the single generalized coordinate.


c)  Lagrange  equation  with  Lagrange  multipliers 

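Again the conservative gravitational force is absorbed into the scalar potential while the holonomic constraints are taken into account using Lagrange multipliers. Ignoring the trivial $x$ dependence, the Lagrangian is given above to be

$$L = \frac{1}{2}m\dot y^2 + \frac{1}{2}I\dot\theta^2 - mg(l - y)\sin\alpha$$

The constraint equations are

$$g_1 = y - R\theta = 0$$
$$g_2 = x - R = 0$$

The Lagrange equation for the $y$ coordinate

$$\Lambda_y L = \frac{d}{dt}\frac{\partial L}{\partial\dot y} - \frac{\partial L}{\partial y} = \lambda_1\frac{\partial g_1}{\partial y} + \lambda_2\frac{\partial g_2}{\partial y}$$

gives

$$m\ddot y - mg\sin\alpha = \lambda_1$$

The Lagrange equation for the $\theta$ coordinate

$$\Lambda_\theta L = \frac{d}{dt}\frac{\partial L}{\partial\dot\theta} - \frac{\partial L}{\partial\theta} = \lambda_1\frac{\partial g_1}{\partial\theta} + \lambda_2\frac{\partial g_2}{\partial\theta}$$

which gives

$$I\ddot\theta = -\lambda_1 R$$

The constraint can be written as

$$\ddot y = R\ddot\theta$$

Let $I = \frac{1}{2}mR^2$; solving for $\ddot y$, $\ddot\theta$ and $\lambda_1$ gives

$$\lambda_1 = -\frac{mg}{\left(1 + \frac{mR^2}{I}\right)}\sin\alpha = -\frac{mg}{3}\sin\alpha$$

The frictional force is given by

$$F_f = \lambda_1\frac{\partial g_1}{\partial y} = \lambda_1 = -\frac{mg}{3}\sin\alpha$$

Also

$$m\ddot y = mg\sin\alpha + \lambda_1 = \frac{2}{3}mg\sin\alpha$$

and the torque is

$$-\lambda_1 R = F_f R = I\ddot\theta$$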


d)  Lagrange  equation  using  a generalized  force 

Again the conservative gravitational force is absorbed into the scalar potential while the holonomic constraints are taken into account using generalized forces. Ignoring the trivial $x$ dependence, the Lagrangian was given above to be

$$L = \frac{1}{2}m\dot y^2 + \frac{1}{2}I\dot\theta^2 - mg(l - y)\sin\alpha$$

The generalized forces (6.30) are

$$Q_y = -F_f$$
$$Q_\theta = F_f R$$

The Euler-Lagrange equations are:

The $\Lambda_y L = Q_y$ Lagrange equation for the $y$ coordinate

$$m\ddot y - mg\sin\alpha = Q_y = -F_f$$

The $\Lambda_\theta L = Q_\theta$ Lagrange equation for the $\theta$ coordinate

$$I\ddot\theta = Q_\theta = F_f R$$

The constraint equation gives that $\ddot y = R\ddot\theta$, and assuming $I = \frac{1}{2}mR^2$ leads to the $Q_\theta$ relation

$$\frac{Q_\theta}{R} = F_f = \frac{m}{2}\ddot y$$

Substituting this equation into the $Q_y$ relation gives that

$$m\ddot y - mg\sin\alpha = Q_y = -F_f = -\frac{m}{2}\ddot y$$

Thus

$$\ddot y = \frac{2}{3}g\sin\alpha$$

and

$$F_f = \frac{mg}{3}\sin\alpha$$

The  four  methods  for  handling  the  equations  of  constraint  all  are  equivalent  and  result  in  the  same 
equations  of  motion.  The  scalar  Lagrangian  mechanics  is  able  to  calculate  the  vector  forces  acting  in  a direct 
and  simple  way.  The  Newton’s  law  approach  is  more  intuitive  for  this  simple  case  and  the  ease  and  power 
of  the  Lagrangian  approach  is  not  apparent  for  this  simple  system. 
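The Lagrange multiplier treatment of the rolling disk reduces to three linear equations in the unknowns $\ddot y$, $\ddot\theta$ and $\lambda_1$, so it can be checked mechanically. The sketch below assumes exactly the equations of part (c), namely $m\ddot y - \lambda_1 = mg\sin\alpha$, $I\ddot\theta + \lambda_1 R = 0$ and $\ddot y - R\ddot\theta = 0$; the helper names and the numerical values are hypothetical, and exact fractions are used so that the factors $\frac{2}{3}$ and $\frac{1}{3}$ appear without rounding.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(b)
    rows = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

def rolling_disk(m, radius, inertia, g_sin_alpha):
    """Return (y_ddot, theta_ddot, lambda_1) for the disk rolling down the incline."""
    a = [
        [m, 0, -1],            # m*y_ddot - lambda_1 = m*g*sin(alpha)
        [0, inertia, radius],  # I*theta_ddot + lambda_1*R = 0
        [1, -radius, 0],       # y_ddot - R*theta_ddot = 0 (rolling constraint)
    ]
    b = [m * g_sin_alpha, 0, 0]
    return solve_linear(a, b)

# Hypothetical values: m = 2, R = 1, uniform disk I = m R^2 / 2, g sin(alpha) = 3
m, radius = Fraction(2), Fraction(1)
inertia = m * radius**2 / 2
y_ddot, theta_ddot, lam = rolling_disk(m, radius, inertia, Fraction(3))
print(y_ddot, theta_ddot, lam)
```

With these values the solution satisfies $\ddot y = \frac{2}{3}g\sin\alpha$ and $\lambda_1 = -\frac{1}{3}mg\sin\alpha$, in agreement with the results obtained by the other three methods.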

The  following  series  of  examples  will  gradually  increase  in  complexity,  and  will  illustrate  the  power, 
elegance,  plus  superiority  of  the  Lagrangian  approach  compared  with  the  Newtonian  approach. 
6.5  Example:  Two  connected  masses  on  frictionless  inclined  planes

Consider the system shown in the figure. This is a problem that has five constraints that will be solved
using the method of generalized coordinates. The obvious generalized coordinates are $x_1$ and $x_2$, which are
perpendicular to the normal constraint forces on the inclined planes. Another holonomic constraint is that
the length of the rope connecting the masses is assumed to be constant. Thus the equation of constraint is that

$$x_1 + x_2 - l = 0$$

The other four constraints ensure that the two masses slide directly down the inclined planes in the plane
shown. This is assumed implicitly by using only the variables $x_1$ and $x_2$. Let us choose $x_1$ as the primary
generalized coordinate, thus

$$x_2 = l - x_1$$

$$y_1 = x_1\sin\theta_1$$

$$y_2 = (l - x_1)\sin\theta_2$$

Figure: Two connected masses on frictionless inclined planes.

The  conservative  gravitational  force  is  absorbed  into  the  potential  energy  given  by 

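$$U = -m_1 g x_1\sin\theta_1 - m_2 g(l - x_1)\sin\theta_2$$

Since $\dot{x}_1 = -\dot{x}_2$ the kinetic energy is given by

$$T = \frac{1}{2}m_1\dot{x}_1^2 + \frac{1}{2}m_2\dot{x}_2^2 = \frac{1}{2}(m_1 + m_2)\dot{x}_1^2$$

The Lagrangian then gives that

$$L = \frac{1}{2}(m_1 + m_2)\dot{x}_1^2 + m_1 g x_1\sin\theta_1 + m_2 g(l - x_1)\sin\theta_2$$

Therefore

$$\frac{\partial L}{\partial\dot{x}_1} = (m_1 + m_2)\dot{x}_1$$

$$\frac{\partial L}{\partial x_1} = g(m_1\sin\theta_1 - m_2\sin\theta_2)$$

Thus

$$\Lambda_{x_1}L = \frac{d}{dt}\frac{\partial L}{\partial\dot{x}_1} - \frac{\partial L}{\partial x_1} = 0 = (m_1 + m_2)\ddot{x}_1 - g(m_1\sin\theta_1 - m_2\sin\theta_2)$$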


Note that the system acts as though the inertial mass is $(m_1 + m_2)$
while the driving force comes from the difference of the forces. The
acceleration is zero if

$$m_1\sin\theta_1 = m_2\sin\theta_2$$

A special case of this is the Atwood's machine with a massless
pulley shown in the adjacent figure. For this case $\theta_1 = \theta_2 = 90°$.
Thus

$$(m_1 + m_2)\ddot{x}_1 = g(m_1 - m_2)$$

Figure: Atwood's machine.

Note that this problem has been solved without any reference to the
force in the rope or the normal constraint forces on the inclined planes.
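
The equation of motion for the two connected masses can be evaluated numerically as a quick check of the two limiting cases mentioned in the text. The sketch below is illustrative only; the function name `incline_acceleration` and the numerical values are assumptions and do not come from the text.

```python
import math

def incline_acceleration(m1, m2, theta1, theta2, g=9.81):
    """Acceleration of x1 from (m1 + m2) x1'' = g (m1 sin(theta1) - m2 sin(theta2))."""
    return g * (m1 * math.sin(theta1) - m2 * math.sin(theta2)) / (m1 + m2)

# Atwood's machine: both "planes" are vertical (theta1 = theta2 = 90 degrees),
# so the acceleration is g (m1 - m2) / (m1 + m2).
a_atwood = incline_acceleration(3.0, 1.0, math.pi / 2, math.pi / 2)

# Balanced case: m1 sin(theta1) = m2 sin(theta2) gives zero acceleration.
a_balanced = incline_acceleration(1.0, 2.0, math.pi / 2, math.pi / 6)
```

With these illustrative masses the Atwood case gives half of $g$, and the balanced case gives zero to within rounding.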
6.6  Example:  Two  blocks  connected  by  a frictionless  bar

Two identical masses $m$ are connected by a massless rigid bar of length $l$, and they are constrained to move
in two frictionless slides, one vertical and the other horizontal, as shown in the adjacent figure. Assume that the
conservative gravitational force acts along the negative $y$ axis and is incorporated into the scalar potential $U$. The
generalized coordinate can be chosen to be the angle $\alpha$ corresponding to a single degree of freedom. The relative
cartesian coordinates of the blocks are given by

$$x = l\cos\alpha$$

$$y = l\sin\alpha$$

Thus

$$\dot{x} = -l(\sin\alpha)\dot{\alpha}$$

$$\dot{y} = l(\cos\alpha)\dot{\alpha}$$

Figure: Two frictionless masses that are connected by a bar and are constrained to slide in vertical and
horizontal channels.

This constraint, which is absorbed into the generalized coordinate, is holonomic, scleronomic, and conservative.
The kinetic energy is given by

$$T = \frac{1}{2}m\left(l^2(\sin\alpha)^2\dot{\alpha}^2 + l^2(\cos\alpha)^2\dot{\alpha}^2\right) = \frac{1}{2}ml^2\dot{\alpha}^2$$

The gravitational potential energy is given by

$$U = mgy = mgl\sin\alpha$$

Thus the Lagrangian is

$$L = \frac{1}{2}ml^2\dot{\alpha}^2 - mgl\sin\alpha$$

Using the Lagrange operator equation $\Lambda_\alpha L = 0$ gives

$$ml^2\ddot{\alpha} + mgl\cos\alpha = 0$$

$$\ddot{\alpha} + \frac{g}{l}\cos\alpha = 0$$

Multiplying by $\dot{\alpha}$ yields

$$\dot{\alpha}\ddot{\alpha} + \frac{g}{l}\dot{\alpha}\cos\alpha = 0$$

This can be integrated to give

$$\frac{1}{2}\dot{\alpha}^2 + \frac{g}{l}\sin\alpha = c$$

where $c$ is a constant. That is

$$\dot{\alpha} = \sqrt{2\left(c - \frac{g}{l}\sin\alpha\right)}$$

Separation of the variables gives

$$dt = \frac{d\alpha}{\sqrt{2\left(c - \frac{g}{l}\sin\alpha\right)}}$$

Integration of this gives

$$t - t_0 = \int_{\alpha_0}^{\alpha}\frac{d\alpha}{\sqrt{2\left(c - \frac{g}{l}\sin\alpha\right)}}$$

The constants $c$ and $t_0$ are determined from the given initial conditions.
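
The first integral and the quadrature for $t - t_0$ can be cross-checked numerically. The sketch below integrates $\ddot{\alpha} + (g/l)\cos\alpha = 0$ with a classical fourth-order Runge-Kutta scheme, confirms that $c$ stays constant, and then evaluates the time integral by Simpson's rule. The initial conditions, step counts, and function names are illustrative assumptions, and the check only applies while $\dot{\alpha}$ remains positive.

```python
import math

def simulate_alpha(alpha0, alpha_dot0, g_over_l, t_end, steps=1000):
    """Integrate alpha'' = -(g/l) cos(alpha) with classical RK4."""
    def deriv(state):
        alpha, alpha_dot = state
        return (alpha_dot, -g_over_l * math.cos(alpha))

    h = t_end / steps
    state = (alpha0, alpha_dot0)
    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv((state[0] + 0.5 * h * k1[0], state[1] + 0.5 * h * k1[1]))
        k3 = deriv((state[0] + 0.5 * h * k2[0], state[1] + 0.5 * h * k2[1]))
        k4 = deriv((state[0] + h * k3[0], state[1] + h * k3[1]))
        state = (state[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
                 state[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return state

def elapsed_time(alpha0, alpha1, c, g_over_l, n=1000):
    """Simpson's rule for t - t0 = integral of d(alpha) / sqrt(2 (c - (g/l) sin(alpha)))."""
    def integrand(a):
        return 1.0 / math.sqrt(2.0 * (c - g_over_l * math.sin(a)))
    h = (alpha1 - alpha0) / n  # n must be even for Simpson's rule
    total = integrand(alpha0) + integrand(alpha1)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(alpha0 + i * h)
    return total * h / 3.0

g_over_l = 9.81                   # illustrative value of g/l in 1/s^2
alpha0, alpha_dot0 = 0.0, 3.0     # illustrative initial conditions
c = 0.5 * alpha_dot0 ** 2 + g_over_l * math.sin(alpha0)

alpha1, alpha_dot1 = simulate_alpha(alpha0, alpha_dot0, g_over_l, t_end=0.1)
c_end = 0.5 * alpha_dot1 ** 2 + g_over_l * math.sin(alpha1)
t_from_integral = elapsed_time(alpha0, alpha1, c, g_over_l)
```

If the derivation is consistent, `c_end` matches `c` and `t_from_integral` reproduces the simulated time of 0.1 s.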
6.7  Example:  Block  sliding  on  a movable  frictionless  inclined  plane

Consider a block of mass $m$ free to slide on a smooth frictionless inclined plane of mass $M$ that is free to slide
horizontally as shown in the adjacent figure. The six degrees of freedom can be reduced to two independent
generalized coordinates since the inclined plane and mass $m$ are confined to slide along specific non-orthogonal
directions. Choose $x$ as the coordinate for movement of the inclined plane in the horizontal $\hat{i}$ direction and $x'$ the
position of the block with respect to the surface of the inclined plane in the $\hat{e}$ direction, which is inclined
downward at an angle $\theta$. Thus the velocity of the inclined plane is

$$\mathbf{V} = \hat{i}\dot{x}$$

while the velocity of the small block on the inclined plane is

$$\mathbf{v} = \hat{i}\dot{x} + \hat{e}\dot{x}'$$

Figure: A block sliding on a frictionless movable inclined plane.

The kinetic energy is given by

$$T = \frac{1}{2}M\mathbf{V}\cdot\mathbf{V} + \frac{1}{2}m\mathbf{v}\cdot\mathbf{v} = \frac{1}{2}M\dot{x}^2 + \frac{1}{2}m\left[\dot{x}^2 + \dot{x}'^2 + 2\dot{x}\dot{x}'\cos\theta\right]$$

The conservative gravitational force is absorbed into the scalar potential energy, which depends only on the
vertical position of the block and is taken to be zero at the top of the wedge.

$$U = -mgx'\sin\theta$$

Thus the Lagrangian is

$$L = \frac{1}{2}M\dot{x}^2 + \frac{1}{2}m\left[\dot{x}^2 + \dot{x}'^2 + 2\dot{x}\dot{x}'\cos\theta\right] + mgx'\sin\theta$$

Consider the Lagrange-Euler equation for the $x$ coordinate, $\Lambda_x L = 0$, which gives

$$\frac{d}{dt}\left[m(\dot{x} + \dot{x}'\cos\theta) + M\dot{x}\right] = 0 \tag{a}$$

which states that $[m(\dot{x} + \dot{x}'\cos\theta) + M\dot{x}]$ is a constant of motion. This constant of motion is just the total
linear momentum of the complete system in the $x$ direction. That is, conservation of the linear momentum
is satisfied automatically by the Lagrangian approach. The Newtonian approach also predicts conservation of
the linear momentum since there are no external horizontal forces.

Consider the Lagrange equation for the $x'$ coordinate, $\Lambda_{x'}L = 0$, which gives

$$\frac{d}{dt}\left[\dot{x}' + \dot{x}\cos\theta\right] = g\sin\theta \tag{b}$$

Performing both of the time derivatives for equations (a) and (b) gives

$$m\left[\ddot{x} + \ddot{x}'\cos\theta\right] + M\ddot{x} = 0$$

$$\ddot{x}' + \ddot{x}\cos\theta = g\sin\theta$$

Solving for $\ddot{x}$ and $\ddot{x}'$ gives

$$\ddot{x} = \frac{-g\sin\theta\cos\theta}{(m + M)/m - \cos^2\theta}$$

and

$$\ddot{x}' = \frac{g\sin\theta}{1 - m\cos^2\theta/(m + M)}$$

This example illustrates the flexibility of being able to use non-orthogonal displacement vectors to specify the
scalar Lagrangian energy. Newtonian mechanics would require more thought to solve this problem.
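
The two coupled equations for $\ddot{x}$ and $\ddot{x}'$ form a linear system, so the closed-form accelerations can be checked by solving that system directly. The following sketch uses Cramer's rule; the function name `wedge_accelerations` and the numerical values are illustrative assumptions rather than part of the original example.

```python
import math

def wedge_accelerations(m, M, theta, g=9.81):
    """Solve for (x'', x_rel'') by Cramer's rule from

        (m + M) x''    + m cos(theta) x_rel'' = 0
        cos(theta) x'' +              x_rel'' = g sin(theta)
    """
    c, s = math.cos(theta), math.sin(theta)
    det = (m + M) - m * c * c
    x_ddot = -m * c * g * s / det
    xrel_ddot = (m + M) * g * s / det
    return x_ddot, xrel_ddot

m, M, theta, g = 1.0, 4.0, math.radians(35.0), 9.81
x_ddot, xrel_ddot = wedge_accelerations(m, M, theta, g)

# Closed forms quoted in the example.
x_ddot_closed = -g * math.sin(theta) * math.cos(theta) / ((m + M) / m - math.cos(theta) ** 2)
xrel_ddot_closed = g * math.sin(theta) / (1.0 - m * math.cos(theta) ** 2 / (m + M))

# Horizontal momentum balance: d/dt of the conserved momentum must vanish.
momentum_rate = m * (x_ddot + xrel_ddot * math.cos(theta)) + M * x_ddot
```

For a very heavy wedge the block acceleration along the incline approaches $g\sin\theta$, which is the familiar fixed-incline result.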
6.8  Example:  Sphere  rolling  without  slipping  down  an  inclined  plane  on  a
frictionless  floor.

A sphere of mass $m$ and radius $r$ rolls, without slipping, down an inclined plane, of mass $M$, sitting on a
frictionless horizontal floor as shown in the adjacent figure. The velocity of the rolling sphere has horizontal
and vertical components of

$$v_x = \dot{x} + r\dot{\theta}\cos\varphi$$

$$v_y = -r\dot{\theta}\sin\varphi$$

Assume initial conditions are $t = 0$, $\xi = 0$, $x = 0$, $\theta = 0$, $y = h$, $\dot{x} = \dot{\theta} = 0$. Choose the independent coordinates
$x$ and $\theta$ as generalized coordinates plus the holonomic constraint $\xi = r\theta$. Then the Lagrangian is

$$L = \frac{M}{2}\dot{x}^2 + \frac{m}{2}\left[\dot{x}^2 + r^2\dot{\theta}^2 + 2r\dot{x}\dot{\theta}\cos\varphi\right] + \frac{m}{5}r^2\dot{\theta}^2 - mg(h - r\theta\sin\varphi)$$

Lagrange's equations $\Lambda_x L = 0$ and $\Lambda_\theta L = 0$ give

$$(M + m)\ddot{x} + mr\ddot{\theta}\cos\varphi = 0$$

$$\ddot{x}\cos\varphi + \frac{7}{5}r\ddot{\theta} - g\sin\varphi = 0$$

Eliminating $\ddot{x}$ gives

$$r\ddot{\theta}\left[\frac{7}{5} - \frac{m\cos^2\varphi}{M + m}\right] = g\sin\varphi$$

Integrating this equation, assuming the initial conditions, results in

$$\theta = \frac{5(M + m)\sin\varphi}{2\left[7(M + m) - 5m\cos^2\varphi\right]}\frac{g}{r}t^2$$

Thus

$$x = -\frac{mr\cos\varphi}{M + m}\theta = -\frac{5m\sin(2\varphi)}{4\left[7(M + m) - 5m\cos^2\varphi\right]}gt^2$$

Figure: Solid sphere rolling without slipping on an inclined plane on a frictionless horizontal floor.

Note that these equations predict conservation of linear momentum for the block plus sphere.
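
Because the two Lagrange equations are linear in $\ddot{x}$ and $\ddot{\theta}$, the quoted solutions can be verified by solving the pair directly. The sketch below assumes the sphere radius is written $r$ throughout and takes the sign of $x$ from the first Lagrange equation; the function name and numerical values are illustrative assumptions.

```python
import math

def sphere_on_wedge_accelerations(m, M, r, phi, g=9.81):
    """Solve the two Lagrange equations for (x'', theta'') by Cramer's rule.

        (M + m) x''  + m r cos(phi) theta'' = 0
        cos(phi) x'' + (7/5) r theta''      = g sin(phi)
    """
    c, s = math.cos(phi), math.sin(phi)
    det = (M + m) * (7.0 / 5.0) * r - m * r * c * c
    x_ddot = -m * r * c * g * s / det
    theta_ddot = (M + m) * g * s / det
    return x_ddot, theta_ddot

m, M, r, phi, g = 1.0, 3.0, 0.1, math.radians(30.0), 9.81
x_ddot, theta_ddot = sphere_on_wedge_accelerations(m, M, r, phi, g)

# Second time derivatives of the closed forms theta(t) and x(t), both proportional to t^2.
bracket = 7.0 * (M + m) - 5.0 * m * math.cos(phi) ** 2
theta_ddot_closed = 5.0 * (M + m) * math.sin(phi) * g / (bracket * r)
x_ddot_closed = -5.0 * m * math.sin(2.0 * phi) * g / (2.0 * bracket)

# Horizontal momentum balance for block plus sphere.
momentum_residual = (M + m) * x_ddot + m * r * math.cos(phi) * theta_ddot
```

The vanishing `momentum_residual` is the numerical counterpart of the statement that horizontal linear momentum is conserved.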


6.9  Example:  Mass  sliding  on  a rotating  straight  frictionless  rod. 


Consider a mass $m$ sliding on a frictionless rod that rotates about one end of the rod with an angular velocity
$\dot{\theta}$. Choose $r$ and $\theta$ to be generalized coordinates. Then the kinetic energy is given by

$$T = \frac{1}{2}m\dot{r}^2 + \frac{1}{2}mr^2\dot{\theta}^2$$

and potential energy

$$U = 0$$

The Lagrange equation for $\theta$ gives

$$\Lambda_\theta L = \frac{d}{dt}\frac{\partial L}{\partial\dot{\theta}} - \frac{\partial L}{\partial\theta} = \frac{d}{dt}\left(mr^2\dot{\theta}\right) = 0$$

Figure: Mass sliding on a rotating straight frictionless rod.

Thus the angular momentum is constant

$$mr^2\dot{\theta} = \text{constant} = p_\theta$$

The Lagrange equation for $r$ gives

$$\Lambda_r L = \frac{d}{dt}\frac{\partial L}{\partial\dot{r}} - \frac{\partial L}{\partial r} = m\ddot{r} - mr\dot{\theta}^2 = 0$$

The $\theta$ equation states that the angular momentum is conserved for this case, which is what we expect since
there are no external torques acting on the system. The $r$ equation states that the centrifugal acceleration is
$\ddot{r} = r\omega^2$, where $\omega = \dot{\theta}$. These equations of motion were derived without reference to the forces between the rod and mass.


6.10  Example:  Spherical  pendulum 

The spherical pendulum is a classic holonomic problem in mechanics that involves rotation plus oscillation
where the pendulum is free to swing in any direction. This also applies to a particle constrained
to slide in a smooth frictionless spherical bowl under gravity, such as a bar of soap in a wet hemispherical
sink. Consider the equation of motion of the spherical pendulum of mass $m$ and length $b$ shown in the
adjacent figure. The most convenient generalized coordinates are $r, \theta, \phi$ with origin at the fulcrum, since
the length is constrained to be $r = b$. The kinetic energy is

$$T = \frac{1}{2}mb^2\dot{\theta}^2 + \frac{1}{2}mb^2\sin^2\theta\,\dot{\phi}^2$$

The potential energy

$$U = -mgb\cos\theta$$

giving that

$$L = \frac{1}{2}mb^2\dot{\theta}^2 + \frac{1}{2}mb^2\sin^2\theta\,\dot{\phi}^2 + mgb\cos\theta$$

Figure: Spherical pendulum.

The Lagrange equation for $\theta$

$$\Lambda_\theta L = \frac{d}{dt}\frac{\partial L}{\partial\dot{\theta}} - \frac{\partial L}{\partial\theta} = 0$$

which gives

$$mb^2\ddot{\theta} = mb^2\dot{\phi}^2\sin\theta\cos\theta - mgb\sin\theta$$

The Lagrange equation for $\phi$

$$\Lambda_\phi L = \frac{d}{dt}\frac{\partial L}{\partial\dot{\phi}} - \frac{\partial L}{\partial\phi} = \frac{d}{dt}\left[mb^2\sin^2\theta\,\dot{\phi}\right] = 0$$

which gives

$$mb^2\sin^2\theta\,\dot{\phi} = p_\phi = \text{constant}$$

This is just the angular momentum $p_\phi$ for the pendulum rotating in the $\phi$ direction. Automatically the
Lagrange approach shows that the angular momentum $p_\phi$ is a conserved quantity. This is what is expected
from Newton's Laws of Motion since there are no external torques applied about this vertical axis.

The equation of motion for $\theta$ can be simplified to

$$\ddot{\theta} + \frac{g}{b}\sin\theta - \frac{p_\phi^2\cos\theta}{m^2b^4\sin^3\theta} = 0$$

There are many possible solutions depending on the initial conditions. The pendulum can just oscillate
in the $\theta$ direction, or rotate in the $\phi$ direction, or some combination of these. Note that if $p_\phi$ is zero, then
the equation reduces to the simple harmonic pendulum, while the other extreme is when $\ddot{\theta} = 0$ for which the
motion is that of a conical pendulum that rotates at a constant angle $\theta_0$ to the vertical axis.
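
The conical-pendulum limit can be illustrated numerically. Setting $\ddot{\theta} = 0$ in the $\theta$ equation gives $\dot{\phi}^2 = g/(b\cos\theta_0)$, which fixes the angular momentum needed for steady circling at $\theta_0$. The sketch below evaluates that value and confirms that it makes $\ddot{\theta}$ vanish; the function names and numbers are illustrative assumptions.

```python
import math

def theta_ddot(theta, p_phi, m, b, g=9.81):
    """theta'' from theta'' + (g/b) sin(theta) - p_phi^2 cos(theta) / (m^2 b^4 sin^3(theta)) = 0."""
    return (p_phi ** 2 * math.cos(theta) / (m ** 2 * b ** 4 * math.sin(theta) ** 3)
            - (g / b) * math.sin(theta))

def conical_p_phi(theta0, m, b, g=9.81):
    """Angular momentum for steady circling at theta0 (requires 0 < theta0 < pi/2)."""
    phi_dot = math.sqrt(g / (b * math.cos(theta0)))
    return m * b ** 2 * math.sin(theta0) ** 2 * phi_dot

m, b, theta0 = 1.0, 2.0, 0.5   # illustrative values: kg, m, rad
p_phi = conical_p_phi(theta0, m, b)
residual = theta_ddot(theta0, p_phi, m, b)

# With p_phi = 0 the same function reduces to the plane pendulum, theta'' = -(g/b) sin(theta).
plane_pendulum = theta_ddot(theta0, 0.0, m, b)
```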
6.11  Example:  Spring  plane  pendulum

A mass $m$ is suspended by a spring with spring constant $k$ in the gravitational field. Besides the longitudinal
spring vibration, the spring performs a plane pendulum motion in the vertical plane, as illustrated in
the adjacent figure. Find the Lagrangian, the equations of motion, and the force in the spring.

The system is holonomic, conservative, and scleronomic. Introduce plane polar coordinates with radial
length $r$ and polar angle $\theta$ as generalized coordinates. The generalized coordinates are related to the cartesian
coordinates by

$$y = r\cos\theta$$

$$x = r\sin\theta$$

Therefore the velocities are given by

$$\dot{y} = \dot{r}\cos\theta - r\dot{\theta}\sin\theta$$

$$\dot{x} = \dot{r}\sin\theta + r\dot{\theta}\cos\theta$$

The kinetic energy is given by

$$T = \frac{1}{2}m\left(\dot{x}^2 + \dot{y}^2\right) = \frac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}^2\right)$$

The gravitational plus spring potential energies both can be absorbed into the potential $U$.

$$U = -mgr\cos\theta + \frac{k}{2}(r - r_0)^2$$

Figure: Spring pendulum having spring constant $k$ and oscillating in a vertical plane.

where $r_0$ denotes the rest length of the spring. The Lagrangian thus equals

$$L = \frac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) + mgr\cos\theta - \frac{k}{2}(r - r_0)^2$$

For the polar angle $\theta$, the Lagrange equation $\Lambda_\theta L = 0$ gives

$$\frac{d}{dt}\left(mr^2\dot{\theta}\right) = -mgr\sin\theta$$

The angular momentum $p_\theta = mr^2\dot{\theta}$, thus the equation of motion can be written as

$$\dot{p}_\theta = -mgr\sin\theta$$

Alternatively, evaluating $\frac{d}{dt}\left(mr^2\dot{\theta}\right)$ gives

$$mr^2\ddot{\theta} = -mgr\sin\theta - 2mr\dot{r}\dot{\theta}$$

The last term on the right-hand side is the Coriolis force caused by the time variation of the pendulum length.
For the radial distance $r$, the Lagrange equation $\Lambda_r L = 0$ gives

$$m\ddot{r} = mr\dot{\theta}^2 + mg\cos\theta - k(r - r_0)$$

This equation just equals the tension in the spring, i.e. $F = m\ddot{r}$. The first term on the right-hand side
represents the centrifugal radial acceleration, the second term is the component of the gravitational force,
and the third term represents Hooke's Law for the spring. For small amplitudes of $\theta$ the motion appears as
a superposition of harmonic oscillations in the $r, \theta$ plane.

In this example the orthogonal coordinate approach used gave the tension in the spring, thus it is
unnecessary to repeat this using the Lagrange multiplier approach.
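
Since the system is conservative and scleronomic, the total energy $T + U$ should stay constant along any solution of the two equations of motion. The sketch below integrates the radial and angular equations with a classical Runge-Kutta step and compares the energy at the start and end. All parameter values, the step size, and the helper names are illustrative assumptions.

```python
import math

def accelerations(r, theta, r_dot, theta_dot, m, k, r0, g):
    """r'' and theta'' from the two Lagrange equations of the spring pendulum."""
    r_ddot = r * theta_dot ** 2 + g * math.cos(theta) - (k / m) * (r - r0)
    theta_ddot = -(g * math.sin(theta) + 2.0 * r_dot * theta_dot) / r
    return r_ddot, theta_ddot

def energy(state, m, k, r0, g):
    """Total energy T + U for state = (r, theta, r_dot, theta_dot)."""
    r, theta, r_dot, theta_dot = state
    kinetic = 0.5 * m * (r_dot ** 2 + (r * theta_dot) ** 2)
    potential = -m * g * r * math.cos(theta) + 0.5 * k * (r - r0) ** 2
    return kinetic + potential

def rk4_step(state, h, m, k, r0, g):
    """One classical fourth-order Runge-Kutta step of size h."""
    def f(s):
        r, theta, r_dot, theta_dot = s
        r_dd, th_dd = accelerations(r, theta, r_dot, theta_dot, m, k, r0, g)
        return (r_dot, theta_dot, r_dd, th_dd)

    def shift(s, d, scale):
        return tuple(si + scale * di for si, di in zip(s, d))

    k1 = f(state)
    k2 = f(shift(state, k1, 0.5 * h))
    k3 = f(shift(state, k2, 0.5 * h))
    k4 = f(shift(state, k3, h))
    return tuple(s + h * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

m, k, r0, g = 1.0, 40.0, 1.0, 9.81      # illustrative values
state = (1.3, 0.3, 0.0, 0.0)            # r, theta, r_dot, theta_dot
e_start = energy(state, m, k, r0, g)
for _ in range(4000):                   # 2 s of motion with h = 0.5 ms
    state = rk4_step(state, 5e-4, m, k, r0, g)
e_end = energy(state, m, k, r0, g)
```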
6.12  Example:  The  yo-yo

Consider a yo-yo comprising a disc that has a string wrapped around it with one end attached to a fixed
support. The disc is allowed to fall with the string unwinding as it falls, as illustrated in the adjacent figure.
Derive the equations of motion and the forces of constraint via use of Lagrange multipliers. Use $y$ and $\phi$ as
independent generalized coordinates.

The kinetic energy of the falling yo-yo is given by

$$T = \frac{1}{2}m\dot{y}^2 + \frac{1}{2}I\dot{\phi}^2 = \frac{1}{2}m\dot{y}^2 + \frac{1}{4}ma^2\dot{\phi}^2$$

where $m$ is the mass of the disc, $a$ the radius, and $I = \frac{1}{2}ma^2$ is the moment of inertia of the disc about its central
axis. The potential energy of the disc is

$$U = -mgy$$

Thus the Lagrangian is

$$L = \frac{1}{2}m\dot{y}^2 + \frac{1}{4}ma^2\dot{\phi}^2 + mgy$$

The one equation of constraint is holonomic

$$g(y, \phi) = y - a\phi = 0$$

Figure: The yo-yo comprises a falling disc unrolling from a string attached to the disc at one end
and a fixed support at the other end.

The two Lagrange equations are

$$\frac{\partial L}{\partial y} - \frac{d}{dt}\frac{\partial L}{\partial\dot{y}} + \lambda\frac{\partial g}{\partial y} = 0$$

$$\frac{\partial L}{\partial\phi} - \frac{d}{dt}\frac{\partial L}{\partial\dot{\phi}} + \lambda\frac{\partial g}{\partial\phi} = 0$$

with only one Lagrange multiplier $\lambda$. Evaluating these two Euler-Lagrange equations leads to two equations
of motion

$$mg - m\ddot{y} + \lambda = 0$$

$$-\frac{1}{2}ma^2\ddot{\phi} - \lambda a = 0$$

Differentiating the equation of constraint gives

$$\ddot{\phi} = \frac{\ddot{y}}{a}$$

Inserting this into the second equation and solving the two equations gives

$$\lambda = -\frac{1}{3}mg$$

Inserting $\lambda$ into the two equations of motion gives

$$\ddot{y} = \frac{2}{3}g$$

$$\ddot{\phi} = \frac{2}{3}\frac{g}{a}$$

The generalized force of constraint

$$F_y = \lambda\frac{\partial g}{\partial y} = -\frac{1}{3}mg$$

and the constraint torque is

$$N_\phi = \lambda\frac{\partial g}{\partial\phi} = \frac{1}{3}mga$$

Thus the string reduces the acceleration of the disc in the gravitational field by one third, from $g$ to $\frac{2}{3}g$.
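
The algebra for the yo-yo can be checked with exact rational arithmetic. The sketch below solves the two equations of motion together with the differentiated constraint for a disc whose moment of inertia is $I = k\,ma^2$; the parameter $k$ (called `inertia_factor`) and the function name are illustrative assumptions, with $k = \frac{1}{2}$ reproducing the uniform disc of the example.

```python
from fractions import Fraction

def yoyo_solution(inertia_factor=Fraction(1, 2)):
    """Return (y''/g, a*phi''/g, lambda/(m g)) for I = inertia_factor * m * a^2.

    Equations used:
        m y''  = m g + lambda
        I phi'' = -lambda a
        y''    = a phi''
    Eliminating phi'' gives lambda = -k m y'' and y'' = g / (1 + k).
    """
    k = Fraction(inertia_factor)
    y_ddot = Fraction(1) / (1 + k)   # in units of g
    a_phi_ddot = y_ddot              # rolling constraint, in units of g
    lam = -k * y_ddot                # in units of m g
    return y_ddot, a_phi_ddot, lam

y_ddot, a_phi_ddot, lam = yoyo_solution()

# Constraint torque N_phi = lambda * dg/dphi = -lambda * a, in units of m g a.
torque = -lam
```

For the uniform disc this returns $\ddot{y} = \frac{2}{3}g$, $\lambda = -\frac{1}{3}mg$ and a constraint torque of $\frac{1}{3}mga$, in agreement with the values above.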
6.13  Example:  Mass  constrained  to  move  on  the  inside  of  a frictionless  paraboloid

A mass $m$ moves on the frictionless inner surface of a paraboloid

$$x^2 + y^2 = \rho^2 = az$$

with a gravitational potential energy of $U = mgz$.

This system is holonomic, scleronomic, and conservative. Choose cylindrical coordinates $\rho, \phi, z$ with respect
to the vertical axis of the paraboloid to be the generalized coordinates.

The Lagrangian is

$$L = \frac{1}{2}m\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2\right) - mgz$$

The equation of constraint is

$$g(\rho, z) = \rho^2 - az = 0$$

The Lagrange multiplier approach will be used to determine the forces of constraint.

Figure: Mass constrained to slide on the inside of a frictionless paraboloid.

For $\Lambda_\rho L = \lambda_1\frac{\partial g}{\partial\rho}$

$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\rho}} - \frac{\partial L}{\partial\rho} = \lambda_1 2\rho$$

$$m\left(\ddot{\rho} - \rho\dot{\phi}^2\right) = \lambda_1 2\rho \tag{a}$$

For $\Lambda_\phi L = \lambda_1\frac{\partial g}{\partial\phi}$

$$\frac{d}{dt}\left(m\rho^2\dot{\phi}\right) = \dot{p}_\phi = 0 \tag{b}$$

Thus the angular momentum $p_\phi$ is conserved, that is, it is a constant of motion.

For $\Lambda_z L = \lambda_1\frac{\partial g}{\partial z}$

$$m\ddot{z} = -mg - \lambda_1 a \tag{c}$$

and the time differential of the constraint equation is

$$2\rho\dot{\rho} - a\dot{z} = 0 \tag{d}$$

The above four equations of motion can be used to determine $\rho, \phi, z, \lambda_1$.

The radius of the circle at the intersection of the plane $z = h$ with the paraboloid $\rho^2 = az$ is given by
$\rho_0 = \sqrt{ah}$. For a constant height $z = h$, then $\ddot{z} = 0$ and equation (c) reduces to

$$\lambda_1 = -\frac{mg}{a}$$

Therefore the constraint force $F_c$ is given by

$$F_c = \lambda_1\frac{\partial g(\rho, z)}{\partial\rho} = -\frac{mg}{a}2\rho$$

Assuming that $\ddot{\rho} = 0$, then equation (a) for $\dot{\phi} = \omega$ and $\rho = \rho_0$ gives

$$m\left(0 - \rho_0\omega^2\right) = \lambda_1 2\rho_0 = -\frac{mg}{a}2\rho_0 = F_c$$

That is, the constraint force equals

$$F_c = -m\rho_0\omega^2$$

which is the usual centripetal force. These relations also give that the initial angular velocity required for
such a stable trajectory with height $h$ is

$$\omega = \sqrt{\frac{2g}{a}}$$
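
For the horizontal circular orbit, equations (a) and (c) can be evaluated directly. The sketch below computes the orbit radius, the multiplier, the angular velocity implied by equation (a) with $\ddot{\rho} = 0$, and the constraint force, so that the centripetal relation can be checked numerically. The function name and the parameter values are illustrative assumptions.

```python
import math

def circular_orbit_on_paraboloid(a, h, m=1.0, g=9.81):
    """Horizontal circular orbit at height h on the paraboloid rho^2 = a z."""
    rho0 = math.sqrt(a * h)
    lam = -m * g / a                    # equation (c) with z'' = 0
    omega = math.sqrt(-2.0 * lam / m)   # equation (a) with rho'' = 0
    force = 2.0 * lam * rho0            # F_c = lambda_1 * dg/drho
    return rho0, lam, omega, force

rho0, lam, omega, force = circular_orbit_on_paraboloid(a=0.5, h=0.2, m=2.0)
```

Under these assumptions the angular velocity comes out independent of the height $h$, because both the constraint force and the required centripetal force scale linearly with $\rho_0$.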
6.14  Example:  Mass  on  a frictionless  plane  connected  to  a plane  pendulum

The masses $m_1$ and $m_2$ are connected by a string of length $l$. Mass $m_1$ is on a horizontal frictionless table
and it is assumed that mass $m_2$ moves in a vertical plane. This is another problem involving holonomic constrained
motion. The constraints are:

1) $m_1$ moves in the horizontal plane

2) $m_2$ moves in the vertical plane

3) $r + s = l$. Therefore $\dot{r} = -\dot{s}$

There are $6 - 3 = 3$ remaining degrees of freedom after taking the constraints into account. Choose as a set of
generalized coordinates $r$, $\theta$, and $\phi$. In terms of these three generalized coordinates, the kinetic energy is

$$T = \frac{1}{2}m_1\left(\dot{r}^2 + (l - r)^2\dot{\phi}^2\right) + \frac{1}{2}m_2\left(\dot{r}^2 + r^2\dot{\theta}^2\right)$$

The potential energy in terms of the generalized coordinates, relative to the horizontal plane, is

$$U = 0 - m_2gr\cos\theta$$

Figure: Mass $m_2$, hanging from a rope that is connected to $m_1$, which slides on a frictionless plane.

Therefore the Lagrangian equals

$$L = \frac{1}{2}m_1\left(\dot{r}^2 + (l - r)^2\dot{\phi}^2\right) + \frac{1}{2}m_2\left(\dot{r}^2 + r^2\dot{\theta}^2\right) + m_2gr\cos\theta$$

The differentials are

$$\frac{\partial L}{\partial r} = -m_1(l - r)\dot{\phi}^2 + m_2r\dot{\theta}^2 + m_2g\cos\theta$$

$$\frac{\partial L}{\partial\dot{r}} = (m_1 + m_2)\dot{r}$$

$$\frac{\partial L}{\partial\theta} = -m_2gr\sin\theta$$

$$\frac{\partial L}{\partial\dot{\theta}} = m_2r^2\dot{\theta}$$

$$\frac{\partial L}{\partial\phi} = 0$$

$$\frac{\partial L}{\partial\dot{\phi}} = m_1(l - r)^2\dot{\phi}$$

Thus the three Lagrange equations are

$$\Lambda_r L = (m_1 + m_2)\ddot{r} + m_1(l - r)\dot{\phi}^2 - m_2r\dot{\theta}^2 - m_2g\cos\theta = 0$$

$$\Lambda_\theta L = \frac{d}{dt}\left(m_2r^2\dot{\theta}\right) + m_2gr\sin\theta = 0$$

that is

$$2m_2r\dot{r}\dot{\theta} + r^2m_2\ddot{\theta} + m_2gr\sin\theta = 0$$

$$\Lambda_\phi L = \frac{d}{dt}\left[m_1(l - r)^2\dot{\phi}\right] = 0$$

This last equation is a statement of the conservation of angular momentum. These three differential equations
of motion can be solved for known initial conditions.
6.15  Example:  Two  connected  masses  constrained  to  slide  along  a moving  rod

Consider two identical masses $m$, constrained to move along the axis of a thin straight rod, of mass $M$ and length
$l$, which is free to both translate and rotate. Two identical springs link the two masses to the central point of the
rod. Consider only motions of the system for which the extended lengths of the two springs are equal and opposite
such that the two masses always are equal distances from the center of the rod, keeping the center of mass at the
center of the rod. Find the equations of motion for this system.

Use a fixed cartesian coordinate system $(x, y, z)$ and a moving frame with the origin $O$ at the center of the
rod with its cartesian coordinates $(x', y', z')$ being parallel to the fixed coordinate frame as shown in the figure. Let
$(r, \theta, \phi)$ be the spherical coordinates of a point referring to the center of the moving $(x', y', z')$ frame as shown in the
figure. Then the two masses $m$ have spherical coordinates $(r, \theta, \phi)$ and $(-r, \theta, \phi)$ in the moving-rod fixed frame. The
frictionless constraints are holonomic.

Figure: Two identical masses $m$ constrained to slide on a moving rod of mass $M$. The masses are
attached to the center of the rod by identical springs each having a spring constant $K$.

The kinetic energy of the system is equal to the kinetic energy for all the mass concentrated at the center
of mass plus the kinetic energy about the center of mass. Since $O$ is the center of mass then the kinetic
energy can be separated into three terms

$$T = T_{cm} + T_{rot}^{masses} + T_{rot}^{rod}$$

Note that since the kinetic energy is a scalar quantity it is rotationally invariant and thus can be evaluated in
any rotated frame. Thus the kinetic energy of the center of mass is

$$T_{cm} = \frac{1}{2}(M + 2m)\left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right)$$

The rotational kinetic energy of the two masses in the center of mass frame is

$$T_{rot}^{masses} = m\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\dot{\phi}^2\sin^2\theta\right)$$

The rotational kinetic energy of the rod $T_{rot}^{rod}$ is a scalar and thus can be evaluated in any rotated frame of
reference fixed with respect to the principal axis system of the rod. The angular velocity of the rod about $O$
resolved along its principal axes is given by

$$\boldsymbol{\omega} = \dot{\phi}\cos\theta\,\hat{e}_r - \dot{\phi}\sin\theta\,\hat{e}_\theta - \dot{\theta}\,\hat{e}_\phi$$

The corresponding moments of inertia of the uniform infinitesimally-thin rod are $I_r = 0$, $I_\theta = \frac{1}{12}Ml^2$,
$I_\phi = \frac{1}{12}Ml^2$. Hence the rotational kinetic energy of the rod is

$$T_{rot}^{rod} = \frac{1}{2}\left(I_r\omega_r^2 + I_\theta\omega_\theta^2 + I_\phi\omega_\phi^2\right) = \frac{1}{24}Ml^2\left(\dot{\theta}^2 + \dot{\phi}^2\sin^2\theta\right)$$

The only potential energy is due to the two extended springs, which are assumed to have the same length $r$,
where $r_0$ is the unstretched length.

$$U = 2\cdot\frac{1}{2}K(r - r_0)^2 = K(r - r_0)^2$$

Thus the Lagrangian is

$$L = \frac{1}{2}(M + 2m)\left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right) + m\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\dot{\phi}^2\sin^2\theta\right) + \frac{1}{24}Ml^2\left(\dot{\theta}^2 + \dot{\phi}^2\sin^2\theta\right) - K(r - r_0)^2$$

Using Lagrange's equations $\Lambda_{q_i}L = 0$ for the generalized coordinates gives

$$(M + 2m)\dot{x} = \text{constant} \qquad (\Lambda_x L = 0)$$

$$(M + 2m)\dot{y} = \text{constant} \qquad (\Lambda_y L = 0)$$

$$(M + 2m)\dot{z} = \text{constant} \qquad (\Lambda_z L = 0)$$

$$\left(2mr^2 + \frac{1}{12}Ml^2\right)\dot{\phi}\sin^2\theta = \text{constant} \qquad (\Lambda_\phi L = 0)$$

$$\ddot{r} - r\dot{\theta}^2 - r\dot{\phi}^2\sin^2\theta + \frac{K}{m}(r - r_0) = 0 \qquad (\Lambda_r L = 0)$$

$$\left(r^2 + \frac{Ml^2}{24m}\right)\ddot{\theta} + 2r\dot{r}\dot{\theta} - \left(r^2 + \frac{Ml^2}{24m}\right)\dot{\phi}^2\sin\theta\cos\theta = 0 \qquad (\Lambda_\theta L = 0)$$

The first three equations show that the three components of the linear momentum of the center of mass
are constants of motion. The fourth equation shows that the component of the angular momentum about
the $z'$ axis is a constant of motion. Since the $z'$ axis has been arbitrarily chosen then the total angular
momentum must be conserved. The fifth and sixth equations give the radial and angular equations of motion
of the oscillating masses $m$.


6.9  Applications  involving  non-holonomic  constraints 

In general, non-holonomic constraints can be handled by use of generalized forces $Q_j^{EXC}$ in the Lagrange-Euler
equations 6.60. The following examples, 6.16 to 6.19, involve one-sided constraints which exhibit
holonomic behavior for restricted ranges of the constraint surface in coordinate space, and this range is case
specific. When the forces of constraint press the object against the constraint surface, then the system is
holonomic, but the holonomic range of coordinate space is limited to situations where the constraint forces
are positive. When the constraint force is negative, the object flies free from the constraint surface. In
addition, when the frictional force $F > N\mu_{static}$, where $\mu_{static}$ is the static coefficient of friction, then the
object slides, negating any rolling constraint that assumes static friction.


6.16  Example:  Mass  sliding  on  a frictionless  spherical  shell 


Consider a mass $m$ that starts from rest at the top of a frictionless fixed spherical shell of radius $R$. The
questions are: what is the force of constraint, and at what angle $\theta$ does the mass leave the surface of the
spherical shell? The coordinates $r, \theta$ shown are the obvious generalized coordinates to use. The constraint will
not apply if the force of constraint does not hold the mass against the surface of the spherical shell, that is,
it is only holonomic in a restricted domain.

The Lagrangian is

$$L = \frac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) - mgr\cos\theta$$

This Lagrangian is applicable irrespective of whether the constraint is obeyed, where the constraint is given by

$$g(r, \theta) = r - R = 0$$

Figure: Mass $m$ sliding on frictionless cylinder of radius $R$.

For the restricted domain where this system is holonomic, it can be solved using generalized coordinates,
generalized forces, Lagrange multipliers, or Newtonian mechanics as illustrated below.

Minimal generalized coordinates:

The minimal number of generalized coordinates reduces the system to one coordinate $\theta$, which does not
determine the constraint force that is needed to know if the constraint applies. Thus this approach is not
useful for solving this partially-holonomic system.
Generalized forces:

The radial constraint has a corresponding generalized force $Q_r$. The Lagrange equation $\Lambda_r L = Q_r$ gives

$$m\ddot{r} + mg\cos\theta - mr\dot{\theta}^2 = Q_r \tag{a}$$

The Lagrange equation $\Lambda_\theta L = Q_\theta = 0$ since there is no tangential force for this frictionless system. Therefore

$$mr^2\ddot{\theta} - mgr\sin\theta + 2mr\dot{r}\dot{\theta} = 0 \tag{b}$$

When constrained to follow the surface of the spherical shell, the system is holonomic, i.e. $r = R$ and
$\dot{r} = \ddot{r} = 0$. Thus the above two equations reduce to

$$mg\cos\theta - mR\dot{\theta}^2 = Q_r \tag{c}$$

$$mR^2\ddot{\theta} - mgR\sin\theta = 0$$

That is

$$\ddot{\theta} = \frac{g}{R}\sin\theta$$

Integrate to get $\dot{\theta}$ using the fact that

$$\ddot{\theta} = \frac{d\dot{\theta}}{dt} = \frac{d\dot{\theta}}{d\theta}\frac{d\theta}{dt} = \dot{\theta}\frac{d\dot{\theta}}{d\theta}$$

then

$$\int\ddot{\theta}\,d\theta = \int\dot{\theta}\,d\dot{\theta} = \frac{g}{R}\int\sin\theta\,d\theta$$

Therefore

$$\dot{\theta}^2 = \frac{2g}{R}(1 - \cos\theta) \tag{d}$$

assuming that $\dot{\theta} = 0$ at $\theta = 0$. Substituting equation (d) into equation (c) gives the constraint force, which
is normal to the surface, to be

$$F = Q_r = mg(3\cos\theta - 2)$$

Note that $F = Q_r = 0$ when $\cos\theta = \frac{2}{3}$, that is $\theta = 48.2°$.

Lagrange multipliers:

For the holonomic regime, which obeys the constraint $g(r, \theta) = r - R = 0$, the Lagrange equation for $r$
is $\Lambda_r L = \lambda\frac{\partial g}{\partial r}$. Since $\frac{\partial g}{\partial r} = 1$, then

$$m\ddot{r} + mg\cos\theta - mr\dot{\theta}^2 = \lambda \tag{a}$$

The Lagrange equation for $\theta$ gives $\Lambda_\theta L = \lambda\frac{\partial g}{\partial\theta} = 0$ since $\frac{\partial g}{\partial\theta} = 0$. Thus

$$mr^2\ddot{\theta} - mgr\sin\theta + 2mr\dot{r}\dot{\theta} = 0 \tag{b}$$

As above, when constrained to follow the surface of the spherical shell, the system is holonomic, $r = R$
and $\dot{r} = \ddot{r} = 0$. Thus the above two equations reduce to

$$mg\cos\theta - mR\dot{\theta}^2 = \lambda \tag{c}$$

$$mR^2\ddot{\theta} - mgR\sin\theta = 0$$

That is, the answers are identical to those obtained using generalized forces, namely

$$\dot{\theta}^2 = \frac{2g}{R}(1 - \cos\theta) \tag{d}$$

assuming that $\dot{\theta} = 0$ at $\theta = 0$.

The force of constraint applied by the surface follows by substituting equation (d) into equation (c), which gives

$$F = \lambda = mg(3\cos\theta - 2)$$

Note that $\lambda = 0$ when $\cos\theta = \frac{2}{3}$, that is $\theta = 48.2°$.

Both of the above methods give identical results and give that the force of constraint is negative when
$\theta > 48.2°$. Assuming that the surface cannot hold the mass against the surface, then the mass will fly off the
spherical shell when $\theta > 48.2°$ and the system reduces to an unconstrained object falling freely in a uniform
gravitational field, which is holonomic, that is $Q_r = \lambda = 0$. Then the equations of motion (a) and (b) reduce
to

$$m\ddot{r} + mg\cos\theta - mr\dot{\theta}^2 = 0 \tag{e}$$

$$mr^2\ddot{\theta} - mgr\sin\theta + 2mr\dot{r}\dot{\theta} = 0 \tag{f}$$

Energy conservation:

This problem can be solved using energy conservation

$$\frac{1}{2}mv^2 = mgR[1 - \cos\theta]$$

Thus the centripetal acceleration

$$\frac{v^2}{R} = 2g[1 - \cos\theta]$$

The normal force to the surface will cancel when the centripetal acceleration equals the radial component of the
gravitational acceleration, that is, when

$$\frac{v^2}{R} = 2g[1 - \cos\theta] = g\cos\theta$$

This occurs when $\cos\theta = \frac{2}{3}$. This is an unusual case where the Newtonian approach is the simplest.
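
The departure angle can also be found numerically by locating the zero of the constraint force $F = mg(3\cos\theta - 2)$. The sketch below uses simple bisection on $[0, \pi/2]$, where the force changes sign exactly once; the function names and tolerance are illustrative assumptions.

```python
import math

def normal_force(theta, m=1.0, g=9.81):
    """Constraint force F = m g (3 cos(theta) - 2) while the mass stays on the shell."""
    return m * g * (3.0 * math.cos(theta) - 2.0)

def departure_angle(tol=1e-12):
    """Bisection for the root of the normal force on [0, pi/2]."""
    lo, hi = 0.0, math.pi / 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if normal_force(mid) > 0.0:
            lo = mid   # still pressed against the surface
        else:
            hi = mid   # constraint force has gone negative
    return 0.5 * (lo + hi)

theta_leave = departure_angle()
theta_leave_deg = math.degrees(theta_leave)

# Energy-conservation form of the same condition: 2 g (1 - cos) = g cos.
energy_mismatch = 2.0 * (1.0 - math.cos(theta_leave)) - math.cos(theta_leave)
```

The root agrees with $\cos\theta = \frac{2}{3}$ and does not depend on $m$, $g$, or $R$.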

6.17  Example:  Rolling  solid  sphere  on  a spherical  shell 

This  is  a similar  problem  to  the  prior  one  with  the  added 
complication  of  rolling  which  is  assumed  to  move  in  a vertical 
plane  making  it  holonomic.  Here  we  would  like  to  determine 
the  forces  of  constraint  to  see  when  the  solid  sphere  flies  off  the 
spherical  shell  and  when  the  friction  is  insufficient  to  stop  the 
rolling  sphere  from  slipping. 

The  best  generalized  coordinates  are  the  distance  of  the  center 
of  the  sphere  from  the  center  of  the  spherical  shell,  r,  9 and  fi. 

It  is  important  to  note  that  <f>  is  measured  with  respect  to  the 
vertical,  not  the  time- dependent  vector  r.  That  is,  the  direction 
of  the  radius  r is  9 which  is  time  dependent  and  thus  is  not  a 
usefid  reference  to  use  to  define  the  angle  <f>.  Let  us  assume 
that  the  sphere  is  uniform  with  a moment  of  inertia  of  I = 

| mo 2 . If  the  tangential  frictional  force  F is  less  than  the  limiting 
value  Npstatics,  with  N > 0,  then  the  sphere  will  roll  without 
slipping  on  the  surface  of  the  cylinder  and  both  constraints  apply. 

Under  these  conditions  the  system  is  holonomic  and  the  solution  is  solved  using  Lagrange  multipliers  and  the 
equations  of  constraint  are  the  following: 

1 ) The  center  of  the  sphere  follows  the  surface  of  the  cylinder 

<?i  = r — R — a = 0 


Disk  of  mass  m,  radius  a,  rolling  on  a 
cylindrical  surface  of  radius  R. 


2)  The  sphere  rolls  without  slipping 


52  = a {<j>  — 9)  — R9  = 0 
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The  kinetic  energy  is  T = 
Lagrangian  is 


\m  ( 


r2  + r 

1 


t jl(j)  and  the  potential  energy  is  U = mgr  cos  9.  Thus  the 
1 _ 2 


1 / • 2\  JL  • 2 

L = -m  yr2  + r29~)  + -/</>  —mgr cos 8 


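Consider the solution using Lagrange multipliers for the holonomic regime where both constraints are satisfied. The constraints lead to the following differential constraint relations

$$\frac{\partial g_1}{\partial r} = 1 \qquad \frac{\partial g_1}{\partial\theta} = 0 \qquad \frac{\partial g_1}{\partial\phi} = 0$$

$$\frac{\partial g_2}{\partial r} = 0 \qquad \frac{\partial g_2}{\partial\theta} = -(R + a) \qquad \frac{\partial g_2}{\partial\phi} = a$$

The Lagrange operator equation $\Lambda_r L$ gives

$$\frac{d}{dt}\frac{\partial L}{\partial\dot r} - \frac{\partial L}{\partial r} = \lambda_1\frac{\partial g_1}{\partial r} + \lambda_2\frac{\partial g_2}{\partial r}$$

that is

$$m\ddot r + mg\cos\theta - mr\dot\theta^2 = \lambda_1 \qquad (a)$$

$\Lambda_\theta L$ gives

$$mr^2\ddot\theta + 2mr\dot r\dot\theta - mgr\sin\theta = -\lambda_2(R + a) \qquad (b)$$

$\Lambda_\phi L$ gives

$$I\ddot\phi = a\lambda_2 \qquad (c)$$

Since the center of the sphere rolling on the spherical shell must have

$$r = R + a$$

then

$$\dot r = \ddot r = 0$$

$$\ddot\phi = \frac{r}{a}\ddot\theta$$

Substituting this into (c) gives

$$\ddot\theta = \frac{a^2}{rI}\lambda_2$$

Inserting this into equation (b) gives

$$\lambda_2 = \frac{mg\sin\theta}{1 + \frac{ma^2}{I}}$$

The moment of inertia about the axis of a solid sphere is $I = \frac{2}{5}ma^2$. Then

$$\lambda_2 = \frac{2}{7}mg\sin\theta$$

But also

$$\ddot\theta = \dot\theta\frac{d\dot\theta}{d\theta} = \frac{a^2}{rI}\lambda_2 = \frac{5}{2mr}\lambda_2 = \frac{5g\sin\theta}{7r}$$

Integrating gives

$$\int\dot\theta\,d\dot\theta = \frac{5g}{7r}\int\sin\theta\,d\theta$$

That is

$$\dot\theta^2 = \frac{10g}{7r}(1 - \cos\theta)$$

assuming that $\dot\theta = 0$ at $\theta = 0$. Inserting this into equation (a), with $\ddot r = 0$, gives

$$\lambda_1 = mg\cos\theta - \frac{10}{7}mg(1 - \cos\theta)$$

That is

$$\lambda_1 = \frac{mg}{7}\left[17\cos\theta - 10\right]$$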

Note that $\lambda_1$ equals zero when

$$\cos\theta = \frac{10}{17}$$

For larger angles $\lambda_1$ is negative, implying that the solid sphere will fly off the surface of the spherical shell.

The sphere will leave the surface of the cylinder when $\cos\theta = \frac{10}{17}$, that is, $\theta = 53.97°$. This is a significantly larger angle than obtained for the similar problem where the mass is sliding on a frictionless cylinder, because the energy stored in rotation implies that the linear velocity of the mass is lower at a given angle $\theta$ for the case of a rolling sphere.

The above discussion has omitted an important fact: if $\mu_{static} < \infty$, the frictional force becomes insufficient to maintain the rolling constraint before $\theta = 53.97°$, that is, the required frictional force will exceed the sliding limit $N\mu_{static}$. To determine when the rolling constraint fails it is necessary to determine the frictional torque

$$F_f R = -\lambda_2 R$$

Thus

$$F_f = -\lambda_2$$

It is in the negative direction because of the direction chosen for $\phi$. The required coefficient of friction $\mu$ is given by the ratio of the frictional force to the normal force, that is

$$\mu = \frac{\lambda_2}{\lambda_1} = \frac{2\sin\theta}{\left[17\cos\theta - 10\right]}$$

For $\mu = 1$ the sphere starts to slip when $\theta = 47.54°$. Note that the sphere starts slipping before it flies off the cylinder, since a normal force is required to support a frictional force, and the difference depends on the coefficient of friction. The no-slipping constraint is not satisfied once the sphere starts slipping, and the frictional force should then equal $\mu_{kinetic}N$. Thus for angles beyond $47.54°$ the problem needs to be solved with the rolling constraint changed to a sliding non-conservative frictional force. This is best handled by including the frictional force and normal forces as generalized forces. Fortunately this will be a small correction. The friction will slightly change the exact angle at which the normal force becomes zero and the system transitions to free motion of the sphere in a gravitational field.
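
The two critical angles quoted in this example can be checked numerically. The following is a minimal sketch, not part of the original solution: it evaluates $\cos\theta = 10/17$ for the fly-off angle and uses a simple bisection (an assumed, illustrative root finder) to locate where the required coefficient $2\sin\theta/(17\cos\theta - 10)$ reaches a chosen $\mu_{static}$. The function names are hypothetical, and the frictionless comparison value $\cos\theta = 2/3$ is the standard textbook result for a sliding particle, quoted here rather than derived.

```python
import math

def fly_off_angle_deg():
    """Angle at which lambda_1 = (m g / 7)(17 cos(theta) - 10) vanishes."""
    return math.degrees(math.acos(10.0 / 17.0))

def required_mu(theta):
    """Friction coefficient needed to keep rolling: lambda_2 / lambda_1."""
    return 2.0 * math.sin(theta) / (17.0 * math.cos(theta) - 10.0)

def slip_angle_deg(mu_static, tol=1e-12):
    """Bisect for the angle at which the required coefficient reaches mu_static.

    required_mu rises monotonically from 0 at theta = 0 and diverges at the
    fly-off angle, so a single root exists in between.
    """
    lo, hi = 0.0, math.acos(10.0 / 17.0) - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if required_mu(mid) < mu_static:
            lo = mid
        else:
            hi = mid
    return math.degrees(0.5 * (lo + hi))

theta_fly = fly_off_angle_deg()                      # rolling sphere leaves the surface
theta_slip = slip_angle_deg(1.0)                     # slipping starts for mu_static = 1
theta_slide = math.degrees(math.acos(2.0 / 3.0))     # frictionless sliding particle (quoted result)

print(f"slip: {theta_slip:.2f}  frictionless fly-off: {theta_slide:.2f}  rolling fly-off: {theta_fly:.2f}")
```

Changing the argument of `slip_angle_deg` shows how the slipping angle approaches the fly-off angle as the assumed static coefficient grows.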


6.18  Example:  Solid  sphere  rolling  plus  slipping  on  a spherical  shell 

Consider the above case when the frictional force is insufficient to constrain the motion to rolling. Now the frictional force $F$ is given by

$$F = N\mu_{sliding}$$

when $N$ is positive.

This can be solved using generalized forces with the previous Lagrangian. Then

$$\frac{d}{dt}\frac{\partial L}{\partial\dot r} - \frac{\partial L}{\partial r} = Q_r = N$$

which gives

$$m\ddot r + mg\cos\theta - mr\dot\theta^2 = N$$

Similarly $\Lambda_\theta L = Q_\theta = -F(R + a)$ gives

$$mr^2\ddot\theta + 2mr\dot r\dot\theta - mgr\sin\theta = -F(R + a)$$

Similarly $\Lambda_\phi L = Q_\phi = aF$ gives

$$I\ddot\phi = aF$$

These can be solved by substituting the relation $F = N\mu_{sliding}$. The sphere flies off the spherical shell when $N < 0$, leading to the free motion discussed in example 6.2. The problem of a solid uniform sphere rolling inside a hollow sphere can be solved the same way.

6.19 Example: Small body held by friction on the periphery of a rolling wheel

Assume that a small body of mass $m$ is balanced on a rolling wheel of mass $M$ and radius $R$ as shown in the figure. The wheel rolls in a vertical plane without slipping on a horizontal surface. This example illustrates that it is possible to use simultaneously a mixture of holonomic constraints, partially-holonomic constraints, and generalized forces.³

Assume that at $t = 0$ the wheel touches the floor at $x = y = 0$ with the mass perched at the top of the wheel at $x = 0$. Let the frictional force acting on the mass $m$ be $F$ and the reaction force of the periphery of the wheel on the mass be $N$. Let $\dot\phi$ be the angular velocity of the wheel, and $\dot x$ the horizontal velocity of the center of the wheel. The polar coordinates $r, \theta$ of the mass $m$ are taken with $r$ measured from the center of the wheel and $\theta$ measured with respect to the vertical. Thus the cartesian coordinates of the small mass $m$ are $(x + r\sin\theta, R + r\cos\theta)$ with respect to the origin at $x = y = 0$.

The kinetic energy is given by

$$T = \frac{1}{2}M\dot x^2 + \frac{1}{2}I\dot\phi^2 + \frac{1}{2}m\left\{\left[\dot x + r\dot\theta\cos\theta + \dot r\sin\theta\right]^2 + \left[\dot r\cos\theta - r\dot\theta\sin\theta\right]^2\right\}$$

[Figure: Small body of mass m held by friction on the periphery of a rolling wheel of mass M and radius R.]

The gravitational force can be absorbed into the scalar potential term of the Lagrangian, which includes only the potential energy of the mass $m$ since the potential energy of the rolling wheel is constant.

$$U = +mg(R + r\cos\theta)$$

Thus the Lagrangian is

$$L = \frac{1}{2}(M + m)\dot x^2 + \frac{1}{2}I\dot\phi^2 + \frac{1}{2}m\left[r^2\dot\theta^2 + 2r\dot x\dot\theta\cos\theta + 2\dot x\dot r\sin\theta + \dot r^2\right] - mg(R + r\cos\theta)$$

The equations of constraint are:

1) The wheel rolls without slipping on the ground plane, leading to a holonomic constraint:

$$g_1 = \dot x - R\dot\phi = x - R\phi = 0$$

2) The mass $m$ is touching the periphery of the wheel, that is, the normal force $N > 0$. This is a one-sided restricted holonomic constraint.

$$g_2 = R - r = 0$$

3) The mass $m$ does not slip on the wheel if the frictional force $F < N\mu_{static}$. When this restricted holonomic constraint is satisfied, then

$$g_3 = \dot\theta - \dot\phi = 0$$

The rolling constraint is holonomic, and can be accounted for using one Lagrange multiplier $\lambda_x$ plus the differential constraint equations

$$\frac{\partial g_1}{\partial x} = 1 \qquad \frac{\partial g_1}{\partial\theta} = 0 \qquad \frac{\partial g_1}{\partial\phi} = -R \qquad \frac{\partial g_1}{\partial r} = 0$$

³ This problem is solved in detail in example 3.19 of "Classical Mechanics and Relativity", by Muller-Kirsten [Mtt06].


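The other two constraints are non-holonomic, and thus these constraint forces are expressed in terms of two generalized forces $Q_\theta$ and $Q_r$ that are related to the tangential force $F$ and radial reaction force $N$. For simplicity, assume that the wheel is a thin-walled cylinder with a moment of inertia of

$$I = MR^2$$

The Euler-Lagrange equations for the four coordinates $x, \theta, \phi, r$ are

$$-\frac{d}{dt}\left[(M + m)\dot x + m\left(r\dot\theta\cos\theta + \dot r\sin\theta\right)\right] + \lambda_x + Q_x = 0 \qquad (\Lambda_x)$$

$$mgr\sin\theta - \left[mr^2\ddot\theta + 2mr\dot r\dot\theta + mr\ddot x\cos\theta\right] + Q_\theta = 0 \qquad (\Lambda_\theta)$$

$$-\frac{d}{dt}\left(MR^2\dot\phi\right) - R\lambda_x = 0 \qquad (\Lambda_\phi)$$

$$-mg\cos\theta + mr\dot\theta^2 - \left(m\ddot x\sin\theta + m\ddot r\right) + Q_r = 0 \qquad (\Lambda_r)$$

The generalized forces can be related to $F$ and $N$ using the definition

$$Q_{q_k} = \mathbf{F}(\mathbf{r})\cdot\frac{\partial\mathbf{r}}{\partial q_k}$$

where $\mathbf{F}(\mathbf{r})$ is the vectorial sum of the forces acting at $\mathbf{r}$. The components of the vector are $\mathbf{r} = (x + r\sin\theta, R + r\cos\theta)$, and $F$ and $N$ are in the directions defined in the figure, which leads to the generalized forces

$$Q_x = -F\cos\theta + N\sin\theta$$

$$Q_\theta = (-F\cos\theta + N\sin\theta)(R\cos\theta) - (F\sin\theta + N\cos\theta)R\sin\theta = -FR$$

$$Q_r = N$$

Solving the above 7 equations, with $r = R$, gives

$$-m\ddot x\sin\theta + mR\dot\theta^2 - mg\cos\theta + N = 0$$

This last equation can also be derived by Newtonian mechanics from consideration of the radial forces acting on the mass.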

The above equations of motion can be used to calculate the motion for the following conditions.

a) Mass not slipping:

This occurs if $\mu = F/N < \mu_{static}$, which also implies that $N > 0$. That is a situation where the system is holonomic with $r = R$, $x = R\phi$, $\dot\theta = \dot\phi$, which can be solved using the generalized coordinate approach with only one independent coordinate, which can be taken to be $\theta$.

b) Mass slipping:

Here the no-slip constraint is violated and thus one has to explicitly include the generalized forces $Q_r$, $Q_x$, $Q_\theta$ and assume that sliding friction is given by $F = N\mu_{sliding}$.

c) Reaction force $N$ is negative:

Here the mass is not subject to any constraints and it is in free fall.

The above example illustrates the flexibility provided by Lagrangian mechanics that allows simultaneous use of Lagrange multipliers, generalized forces, and scalar potential to handle combinations of several holonomic and nonholonomic constraints for a complicated problem.

6.10 Velocity-dependent Lorentz force

The Lorentz force in electromagnetism is unusual in that it is a velocity-dependent force, as well as being a conservative force which can be treated using the concept of potential. That is, the Lorentz force is

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}) \qquad (6.61)$$

It is interesting to use Maxwell's equations and Lagrangian mechanics to show that the Lorentz force can be represented by a conservative potential in Lagrangian mechanics.

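Maxwell's equations can be written as

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0} \qquad (6.62)$$

$$\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0$$

$$\nabla\cdot\mathbf{B} = 0$$

$$\nabla\times\mathbf{B} - \mu_0\varepsilon_0\frac{\partial\mathbf{E}}{\partial t} = \mu_0\mathbf{J}$$

Since $\nabla\cdot\mathbf{B} = 0$, it follows from Appendix H that $\mathbf{B}$ can be represented by the curl of a vector potential, $\mathbf{A}$, that is

$$\mathbf{B} = \nabla\times\mathbf{A} \qquad (6.63)$$

Substituting this into $\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0$ gives that

$$\nabla\times\mathbf{E} + \frac{\partial(\nabla\times\mathbf{A})}{\partial t} = 0$$

$$\nabla\times\left(\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t}\right) = 0 \qquad (6.64)$$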

Since this curl is zero, the bracketed quantity can be represented by the gradient of a scalar potential $\Phi$

$$\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t} = -\nabla\Phi \qquad (6.65)$$

The following shows that the resulting force on a charge $q$ can be derived from a velocity-dependent potential $U$ given by the relation

$$U = q(\Phi - \mathbf{A}\cdot\mathbf{v}) \qquad (6.66)$$

where $\Phi$ is the scalar electrostatic potential. This scalar potential $U$ can be used in the Lagrange equations using the Lagrangian

$$L = \frac{1}{2}m\mathbf{v}\cdot\mathbf{v} - q(\Phi - \mathbf{A}\cdot\mathbf{v}) \qquad (6.67)$$

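The Lorentz force can be derived from this Lagrangian by considering the Lagrange equation for the cartesian coordinate $x$

$$\frac{d}{dt}\frac{\partial L}{\partial\dot x} - \frac{\partial L}{\partial x} = 0 \qquad (6.68)$$

Using the above Lagrangian (6.67) gives

$$m\ddot x + q\left[\frac{dA_x}{dt} + \frac{\partial\Phi}{\partial x} - \frac{\partial\mathbf{A}}{\partial x}\cdot\mathbf{v}\right] = 0 \qquad (6.69)$$

But

$$\frac{dA_x}{dt} = \frac{\partial A_x}{\partial t} + \frac{\partial A_x}{\partial x}\dot x + \frac{\partial A_x}{\partial y}\dot y + \frac{\partial A_x}{\partial z}\dot z \qquad (6.70)$$

and

$$\frac{\partial\mathbf{A}}{\partial x}\cdot\mathbf{v} = \frac{\partial A_x}{\partial x}\dot x + \frac{\partial A_y}{\partial x}\dot y + \frac{\partial A_z}{\partial x}\dot z \qquad (6.71)$$

Inserting equations 6.70 and 6.71 into 6.69 gives

$$F_x = m\ddot x = q\left[-\frac{\partial\Phi}{\partial x} - \frac{\partial A_x}{\partial t} + \left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\dot y - \left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right)\dot z\right] = q\left[\mathbf{E} + \mathbf{v}\times\mathbf{B}\right]_x \qquad (6.72)$$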


Corresponding expressions can be obtained for $F_y$ and $F_z$. Thus the total force is the well-known Lorentz force

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}) \qquad (6.73)$$

This has demonstrated that the electromagnetic scalar potential

$$U = q(\Phi - \mathbf{A}\cdot\mathbf{v}) \qquad (6.74)$$

satisfies Maxwell's equations, gives the Lorentz force, and can be absorbed into the Lagrangian. Note that the velocity-dependent Lorentz force is conservative since $\mathbf{E}$ is conservative, and because $(\mathbf{v}\times\mathbf{B})\cdot\mathbf{v}\,dt = 0$ the magnetic force does no work, since it is perpendicular to the trajectory. The velocity-dependent conservative Lorentz force is an important and ubiquitous force that features prominently in many branches of science. It will be discussed further for the case of relativistic motion in chapter 16.6.
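
The statement that the magnetic part of the Lorentz force does no work can be illustrated numerically. The sketch below is an assumption-laden illustration rather than anything from the text: it takes unit charge and mass, a uniform field $\mathbf{B}$ along $z$, no electric field, and integrates $d\mathbf{v}/dt = (q/m)(\mathbf{E} + \mathbf{v}\times\mathbf{B})$ with a classical fourth-order Runge-Kutta step. The function names and parameter values are hypothetical. The speed should stay constant while the velocity direction rotates about $\mathbf{B}$.

```python
import math

def lorentz_acceleration(v, q, m, E, B):
    """Return a = (q/m)(E + v x B) for 3-component tuples."""
    cross = (
        v[1] * B[2] - v[2] * B[1],
        v[2] * B[0] - v[0] * B[2],
        v[0] * B[1] - v[1] * B[0],
    )
    return tuple((q / m) * (E[i] + cross[i]) for i in range(3))

def integrate_velocity(v0, q, m, E, B, dt, steps):
    """Advance dv/dt = a(v) with the classical fourth-order Runge-Kutta method."""
    def shifted(v, k, h):
        return tuple(v[i] + h * k[i] for i in range(3))

    v = tuple(v0)
    for _ in range(steps):
        k1 = lorentz_acceleration(v, q, m, E, B)
        k2 = lorentz_acceleration(shifted(v, k1, 0.5 * dt), q, m, E, B)
        k3 = lorentz_acceleration(shifted(v, k2, 0.5 * dt), q, m, E, B)
        k4 = lorentz_acceleration(shifted(v, k3, dt), q, m, E, B)
        v = tuple(
            v[i] + dt / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i])
            for i in range(3)
        )
    return v

def speed(v):
    """Magnitude of a 3-component velocity."""
    return math.sqrt(sum(c * c for c in v))

# Illustrative values only: unit charge and mass, uniform B along z, no E field.
v_start = (1.0, 0.0, 0.5)
v_end = integrate_velocity(v_start, q=1.0, m=1.0, E=(0.0, 0.0, 0.0),
                           B=(0.0, 0.0, 1.0), dt=1e-3, steps=10_000)
print(speed(v_start), speed(v_end))
```

Setting a non-zero `E` in the same sketch changes the speed, which is consistent with only the electric part of the force doing work.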


6.11  Time-dependent  forces 

All examples discussed in this chapter have assumed Lagrangians that are time independent. Mathematical systems where the ordinary differential equations do not depend explicitly on the independent variable, which in this case is time $t$, are called autonomous systems. Systems having differential equations governing the dynamical behavior that have time-dependent coefficients are called non-autonomous systems.

In principle it is trivial to incorporate time-dependent behavior into the equations of motion by introducing either a time-dependent generalized force $Q(\mathbf{r}, t)$, or allowing the Lagrangian to be time dependent. For example, in the rocket problem the mass is time dependent. In some cases the time-dependent forces can be represented by a time-dependent potential energy rather than using a generalized force. Solutions for non-autonomous systems can be considerably more difficult to obtain, and can involve regions where the motion is stable and other regions where the motion is unstable or chaotic, similar to the behavior discussed in chapter 4. The following case of a simple pendulum, whose support is undergoing vertical oscillatory motion, illustrates the complexities that can occur for systems involving time-dependent forces.


6.20 Example: Plane pendulum hanging from a vertically-oscillating support

Consider a plane pendulum having a mass $M$ fastened to a massless rigid rod of length $L$ that is at an angle $\theta(t)$ to the vertical gravitational field $g$. The pendulum is attached to a support that is subject to a vertical oscillatory force $F$ such that the vertical position $y$ of the support is

$$y = A\cos\omega t$$

The kinetic energy is

$$T = \frac{1}{2}M\left[\left(L\dot\theta\cos\theta\right)^2 + \left(\dot y + L\dot\theta\sin\theta\right)^2\right] = \frac{1}{2}M\left[L^2\dot\theta^2 + 2L\dot\theta\dot y\sin\theta + \dot y^2\right]$$

and the potential energy is

$$U = Mg\left[L(1 - \cos\theta) + y\right]$$

Thus the Lagrangian is

$$L = \frac{1}{2}M\left[L^2\dot\theta^2 + 2L\dot\theta\dot y\sin\theta + \dot y^2\right] - Mg\left[L(1 - \cos\theta) + y\right]$$

The Euler-Lagrange equations lead to equations of motion for $\theta$ and $y$

$$ML^2\ddot\theta + ML\ddot y\sin\theta + MgL\sin\theta = 0$$

$$ML\ddot\theta\sin\theta + ML\dot\theta^2\cos\theta + M\ddot y + Mg = F$$

Assume the small-angle approximation where $\theta \to 0$, then these two equations reduce to

$$\ddot\theta + \left(\frac{g + \ddot y}{L}\right)\theta = 0$$

$$\ddot y + g = \frac{F}{M}$$

Substituting $\ddot y = -A\omega^2\cos\omega t$ into these equations gives

$$\ddot\theta + \left[\frac{g}{L} - \frac{A\omega^2}{L}\cos\omega t\right]\theta = 0$$

$$M\left(g - A\omega^2\cos\omega t\right) = F$$

These correspond to stable harmonic oscillations about $\theta \approx 0$ if the bracket term is positive, and to unstable motion if the bracket is negative. Thus, for small amplitude oscillation about $\theta \approx 0$ the motion of the system can be unstable whenever the bracket is negative, that is, when the acceleration $A\omega^2\cos\omega t > g$, and resonance behavior can occur coupling the pendulum period and the forcing frequency $\omega$.

This discussion also applies to the inverted pendulum with a surprising result. It is well known that the pendulum is unstable near $\theta = \pi$. However, if the support is oscillating, then for $\theta \approx \pi$ the equations of motion become

$$\ddot\theta - \left[\frac{g}{L} - \frac{A\omega^2}{L}\cos\omega t\right]\theta = 0$$

$$M\left(g - A\omega^2\cos\omega t\right) = F$$

where $\theta$ in the first equation denotes the small angular deviation from $\pi$. The inverted pendulum has stable oscillations about $\theta \approx \pi$ if the bracket is negative, that is, if $A\omega^2\cos\omega t > g$. This illustrates that nonautonomous dynamical systems can involve either stable or unstable motion.
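
The stabilizing effect of the oscillating support on the inverted pendulum can be explored by integrating the linearized equation directly. The following is a rough sketch under stated assumptions, not a result from the text: it integrates $\ddot\delta = \left[(g - A\omega^2\cos\omega t)/L\right]\delta$, where $\delta$ is the small deviation from $\theta = \pi$, using a fourth-order Runge-Kutta step. The parameter values ($L = 1$, $A = 0.1$, $\omega = 100$) are illustrative choices made so that the standard high-frequency (Kapitza) stability condition $A^2\omega^2 > 2gL$, which is not derived in this chapter, is satisfied. The function name is hypothetical.

```python
import math

def simulate_inverted(A, omega, g=9.81, L=1.0, delta0=0.01, t_end=2.0, dt=1e-4):
    """RK4 integration of delta'' = [(g - A*omega**2*cos(omega*t)) / L] * delta.

    delta is the small deviation from the inverted position theta = pi.
    Returns (largest |delta| reached, final delta).
    """
    def accel(t, delta):
        return (g - A * omega**2 * math.cos(omega * t)) / L * delta

    t, delta, rate = 0.0, delta0, 0.0
    max_abs = abs(delta)
    for _ in range(int(round(t_end / dt))):
        k1d, k1v = rate, accel(t, delta)
        k2d, k2v = rate + 0.5 * dt * k1v, accel(t + 0.5 * dt, delta + 0.5 * dt * k1d)
        k3d, k3v = rate + 0.5 * dt * k2v, accel(t + 0.5 * dt, delta + 0.5 * dt * k2d)
        k4d, k4v = rate + dt * k3v, accel(t + dt, delta + dt * k3d)
        delta += dt / 6.0 * (k1d + 2.0 * k2d + 2.0 * k3d + k4d)
        rate += dt / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
        t += dt
        max_abs = max(max_abs, abs(delta))
    return max_abs, delta

# Fixed support (A = 0): the deviation grows exponentially.
static_max, static_final = simulate_inverted(A=0.0, omega=100.0)

# Rapidly oscillating support (illustrative values): the deviation stays bounded.
driven_max, driven_final = simulate_inverted(A=0.1, omega=100.0, t_end=5.0)

print(f"fixed support: final deviation {static_final:.3f}")
print(f"driven support: largest deviation {driven_max:.4f}")
```

The linearized equation is only meaningful while $\delta$ stays small, so the growing value in the fixed-support case signals instability rather than giving a physical angle.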


6.12  Impulsive  forces 


Colliding bodies often involve large impulsive forces that act for a short time. As discussed in chapter 2.12.8, the treatment of impulsive forces or torques is greatly simplified if they act for a sufficiently short time that the displacement during the impact can be ignored, even though the instantaneous change in velocities may be large. The simplicity is achieved by taking the time integral of the Euler-Lagrange equations over the duration $\tau$ of the impulse and assuming $\tau \to 0$.

The impact of the impulse on a system can be handled two ways. The first approach is to use the Euler-Lagrange equation during the impulse to determine the equations of motion

$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j} = Q_j^{EXC} \qquad (6.75)$$

where the impulsive force is introduced using the generalized force $Q_j^{EXC}$. Knowing the initial conditions at time $t$, the conditions at the time $t + \tau$ are given by integration of equation 6.75 over the duration $\tau$ of the impulse, which gives

$$\int_t^{t+\tau}\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right)d\tau - \int_t^{t+\tau}\frac{\partial L}{\partial q_j}d\tau = \int_t^{t+\tau}Q_j^{EXC}d\tau \qquad (6.76)$$

This integration determines the conditions at time $t + \tau$, which then are used as the initial conditions for the motion when the impulsive force $Q_j^{EXC}$ is zero.

The second approach is to realize that equation 6.76 can be rewritten in the form

$$\lim_{\tau\to0}\int_t^{t+\tau}\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right)d\tau = \lim_{\tau\to0}\left.\frac{\partial L}{\partial\dot q_j}\right|_t^{t+\tau} = \Delta p_j = \lim_{\tau\to0}\int_t^{t+\tau}\left[Q_j^{EXC} + \frac{\partial L}{\partial q_j}\right]d\tau \qquad (6.77)$$

Note that in the limit $\tau \to 0$ the integral of the time derivative of the generalized momentum $p_j = \frac{\partial L}{\partial\dot q_j}$ simplifies to give the change in generalized momentum $\Delta p_j$. In addition, assuming that the non-impulsive forces $\frac{\partial L}{\partial q_j}$ are finite and independent of the instantaneous impulsive force during the infinitesimal duration $\tau$, then the contribution of the non-impulsive forces $\int_t^{t+\tau}\left(\frac{\partial L}{\partial q_j}\right)d\tau$ during the impulse can be neglected relative to the large impulsive force term $\lim_{\tau\to0}\int_t^{t+\tau}Q_j^{EXC}d\tau$. Thus it can be assumed that

$$\Delta p_j = \lim_{\tau\to0}\int_t^{t+\tau}Q_j^{EXC}d\tau = \tilde{Q}_j \qquad (6.78)$$

where $\tilde{Q}_j$ is the generalized impulse associated with coordinate $j = 1, 2, 3, \ldots, n$. This generalized impulse can be derived from the time integral of the impulsive forces $\tilde{\mathbf{P}}_i$ given by equation 2.135 using the time integral of equation 6.77, that is

$$\Delta p_j = \tilde{Q}_j = \lim_{\tau\to0}\int_t^{t+\tau}Q_j^{EXC}d\tau \equiv \lim_{\tau\to0}\int_t^{t+\tau}\sum_i\mathbf{F}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}d\tau = \sum_i\tilde{\mathbf{P}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j} \qquad (6.79)$$

Note that the generalized impulse $\tilde{Q}_j$ can be a translational impulse $\tilde{P}_j$ with corresponding translational variable $q_j$, or an angular impulsive torque $\tilde\tau_j$ with corresponding angular variable $\phi_j$.

Impulsive force problems usually are solved in two stages. Either equation 6.76 or equation 6.79 is used to determine the conditions of the system immediately following the impulse. If $\tau \to 0$ then the impulse changes the generalized velocities $\dot q_j$ but not the generalized coordinates $q_j$. The subsequent motion then is determined using the Lagrangian equations of motion with the impulsive generalized force being zero, and assuming that the initial condition corresponds to the result of the impulse calculation.


6.21  Example:  Series-coupled  double  pendulum  subject  to  impulsive  force 

Consider a series-coupled double pendulum comprising two masses $m_1$ and $m_2$ connected by rigid massless rods of lengths $L_1$ and $L_2$ as shown in the figure. Initially the two pendula are at rest and hanging vertically when a horizontal impulse $P$ strikes the system at a distance $D$ below the upper fulcrum where $L_1 < D < L_1 + L_2$. For this system the kinetic energies of the masses $m_1$ and $m_2$ are

$$T_1 = \frac{1}{2}m_1L_1^2\dot\phi_1^2$$

$$T_2 = \frac{1}{2}m_2\left[L_1^2\dot\phi_1^2 + 2L_1L_2\dot\phi_1\dot\phi_2\cos(\phi_1 - \phi_2) + L_2^2\dot\phi_2^2\right]$$

[Figure: Two series-coupled plane pendula.]

Note that the velocity of $m_2$ is the vector sum of the two velocities shown, separated by the angle $\phi_2 - \phi_1$. Thus the total kinetic energy is

$$T = \frac{1}{2}(m_1 + m_2)L_1^2\dot\phi_1^2 + m_2L_1L_2\dot\phi_1\dot\phi_2\cos(\phi_1 - \phi_2) + \frac{1}{2}m_2L_2^2\dot\phi_2^2$$

To first order in $\cos(\phi_1 - \phi_2)$

$$T = \frac{1}{2}(m_1 + m_2)L_1^2\dot\phi_1^2 + m_2L_1L_2\dot\phi_1\dot\phi_2 + \frac{1}{2}m_2L_2^2\dot\phi_2^2$$

The total potential energy is

$$U = m_1gL_1(1 - \cos\phi_1) + m_2g\left[L_1(1 - \cos\phi_1) + L_2(1 - \cos\phi_2)\right]$$

$$= (m_1 + m_2)gL_1(1 - \cos\phi_1) + m_2gL_2(1 - \cos\phi_2)$$

Thus, assuming the small-angle approximation, the Lagrangian becomes

$$L = \frac{1}{2}(m_1 + m_2)L_1^2\dot\phi_1^2 + m_2L_1L_2\dot\phi_1\dot\phi_2 + \frac{1}{2}m_2L_2^2\dot\phi_2^2 - \frac{1}{2}(m_1 + m_2)gL_1\phi_1^2 - \frac{1}{2}m_2gL_2\phi_2^2$$

Use equation 6.79 to transform to the generalized coordinates $\phi_1$ and $\phi_2$ with the corresponding generalized impulsive torques

$$\tilde{Q}_1 = PL_1$$

$$\tilde{Q}_2 = P(D - L_1)$$

Since the system starts at rest where $\dot\phi_1 = \dot\phi_2 = 0$, then using equation 6.77 gives the change in angular momentum immediately following the impulse to be

$$m_1L_1^2\dot\phi_1 + m_2L_1\left(L_1\dot\phi_1 + L_2\dot\phi_2\right) = PL_1$$

$$m_2L_2\left(L_1\dot\phi_1 + L_2\dot\phi_2\right) = P(D - L_1)$$

These two equations determine $\dot\phi_1$ and $\dot\phi_2$ immediately after the impulse; these can be used with $\phi_1 = \phi_2 = 0$ as initial conditions for solving the subsequent force-free motion when the generalized impulsive force is zero.

As described in example 12.5, the subsequent motion of this series-coupled pendulum will be a superposition of the two normal modes with amplitudes determined by the result of the impulse calculation.
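
The two momentum-change equations of this example form a linear system for the angular velocities just after the impulse. A minimal sketch of solving it is given below. It uses the small-angle generalized momenta $\partial L/\partial\dot\phi_1 = (m_1 + m_2)L_1^2\dot\phi_1 + m_2L_1L_2\dot\phi_2$ and $\partial L/\partial\dot\phi_2 = m_2L_1L_2\dot\phi_1 + m_2L_2^2\dot\phi_2$ from the Lagrangian above and Cramer's rule. The function name and the numerical values are hypothetical, chosen only for illustration.

```python
def angular_velocities_after_impulse(P, D, m1, m2, L1, L2):
    """Solve the small-angle impulse equations for (phi1_dot, phi2_dot).

    (m1 + m2) L1^2 w1 + m2 L1 L2 w2 = P L1
    m2 L1 L2 w1       + m2 L2^2 w2  = P (D - L1)
    """
    # The text takes L1 < D < L1 + L2; the end points are allowed here as limiting cases.
    if not (L1 <= D <= L1 + L2):
        raise ValueError("impulse must strike the lower rod: L1 <= D <= L1 + L2")
    a11 = (m1 + m2) * L1**2
    a12 = m2 * L1 * L2
    a22 = m2 * L2**2
    b1 = P * L1
    b2 = P * (D - L1)
    det = a11 * a22 - a12 * a12          # equals m1 * m2 * L1^2 * L2^2 > 0
    w1 = (b1 * a22 - a12 * b2) / det     # Cramer's rule
    w2 = (a11 * b2 - a12 * b1) / det
    return w1, w2

# Illustrative values: equal unit masses and rod lengths, struck at the midpoint of the lower rod.
w1_mid, w2_mid = angular_velocities_after_impulse(P=1.0, D=1.5, m1=1.0, m2=1.0, L1=1.0, L2=1.0)
print(w1_mid, w2_mid)
```

For these particular values the lower rod starts with zero angular velocity, so immediately after the impulse both masses move with the same horizontal velocity.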

6.13 The Lagrangian versus the Newtonian approach to classical mechanics

It is useful to contrast the differences, and relative advantages, of the Newtonian and Lagrangian formulations of classical mechanics. The Newtonian force-momentum formulation is vectorial in nature and has cause and effect embedded in it. The Lagrangian approach is cast in terms of kinetic and potential energies, which involve only scalar functions, and the equations of motion come from a single scalar function, the Lagrangian. The directional properties of the equations of motion come from the requirement that the trajectory is specified by the principle of least action. The directional properties of the vectors in the Newtonian approach assist our intuition when setting up a problem, but the Lagrangian method is simpler mathematically when the mechanical system becomes more complex.

The major advantage of the variational approaches to mechanics is that solution of the dynamical equations of motion can be simplified by expressing the motion in terms of independent generalized coordinates. These generalized coordinates can be any set of independent variables, $q_i$, where $1 \le i \le n$, plus the corresponding velocities $\dot q_i$ for Lagrangian mechanics. These independent generalized coordinates completely specify the scalar potential and kinetic energies used in the Lagrangian or Hamiltonian. The variational approach allows for a much larger arsenal of possible generalized coordinates than the typical vector coordinates used in Newtonian mechanics. For example, the generalized coordinates can be dimensionless amplitudes for the $N$ normal modes of coupled oscillator systems, or action-angle variables. Moreover, very different generalized coordinates can be used for each of the $n$ variables. The tremendous freedom and flexibility in the choice of generalized coordinates is important when constraint forces are acting on the system. Generalized coordinates allow the constraint forces to be ignored by including auxiliary conditions to account for the kinematic constraints that lead to correlated motion. The Lagrange method provides an incredibly consistent and mechanistic problem-solving strategy for many-body systems subject to constraints. Expressed in terms of generalized coordinates, Lagrange's equations can be applied to a wide variety of physical problems including those involving fields. The manipulation of scalar quantities in a configuration space of generalized coordinates can greatly simplify problems compared with being confined to the rigid orthogonal coordinate system that characterizes the Newtonian vector approach.

The use of generalized coordinates in Lagrange's equations of motion can be applied to a wide range of physical phenomena including field theory, such as for electromagnetic fields, which are beyond the applicability of Newton's equations of motion. The superiority of the Lagrangian approach compared to the Newtonian approach for solving problems in mechanics is apparent when dealing with holonomic constraint forces. Constraint forces must be known and included explicitly in the Newtonian equations of motion. Unfortunately, knowledge of the equations of motion is required to derive these constraint forces. For holonomic constrained systems, the equations of motion can be solved directly without calculating the constraint forces by using the minimal set of generalized coordinates approach to Lagrangian mechanics. Moreover, the Lagrange approach has significant philosophical advantages compared to the Newtonian approach.


Newtonian plausibility argument for Lagrangian mechanics:

A justification for introducing the calculus of variations to classical mechanics becomes apparent when the concept of the Lagrangian $L = T - U$ is used in the functional and time $t$ is the independent variable. It was shown that Newton's equation of motion can be rewritten as

$$\frac{d}{dt}\frac{\partial L}{\partial\dot q_i} - \frac{\partial L}{\partial q_i} = F_{q_i}^{EX} \qquad (6.12)$$

where $F_{q_i}^{EX}$ are the excluded forces of constraint plus any other conservative or non-conservative forces not included in the potential $U$. This corresponds to the Euler-Lagrange equation for determining the minimum of the time integral of the Lagrangian.

The excluded force $F_{q_i}^{EX}$ can be partitioned into the holonomic constraint part $F_{q_i}^{HC}$, which can be represented by the Lagrange multiplier term

$$F_{q_i}^{HC} = \sum_{k=1}^{m}\lambda_k(t)\frac{\partial g_k}{\partial q_i} \qquad (6.14)$$

Thus the excluded forces $F_{q_i}^{EX}$ can be separated into the Lagrange multiplier terms plus any remaining excluded forces $F_{q_i}^{EXC}$. That is,

$$F_{q_i}^{EX} = \sum_{k=1}^{m}\lambda_k(t)\frac{\partial g_k}{\partial q_i} + F_{q_i}^{EXC} \qquad (6.13, 6.14)$$

Thus equation 6.12 can be written as

$$\frac{d}{dt}\frac{\partial L}{\partial\dot q_i} - \frac{\partial L}{\partial q_i} = \sum_{k=1}^{m}\lambda_k(t)\frac{\partial g_k}{\partial q_i} + F_{q_i}^{EXC} \qquad (6.15)$$

where the Lagrange multiplier term accounts for holonomic constraint forces, and $F_{q_i}^{EXC}$ includes all additional forces not accounted for by the scalar potential $U$ or the Lagrange multiplier terms $F_{q_i}^{HC}$. As discussed in chapter 6.6.3, the constraint forces can be included explicitly as generalized forces in the excluded term $F_{q_i}^{EXC}$ of equation 6.15.

Note that for unconstrained pure conservative forces, equation 6.15 can be simplified to the Euler-Lagrange equation for $N$ independent coordinates $q_i$.

$$\frac{d}{dt}\frac{\partial L}{\partial\dot q_i} - \frac{\partial L}{\partial q_i} = 0 \qquad (6.16)$$

This is equivalent to using the calculus of variations to minimize the action integral $S = \int_{t_1}^{t_2}L\,dt$, that is

$$\delta S = \delta\int_{t_1}^{t_2}L(q_i, \dot q_i, t)\,dt = 0 \qquad (6.17)$$

where the functional is the Lagrangian and the independent variable is time $t$.
d'Alembert's Principle

It was shown that d'Alembert's Principle

$$\sum_i^N\left(\mathbf{F}_i^A - \dot{\mathbf{p}}_i\right)\cdot\delta\mathbf{r}_i = 0 \qquad (6.25)$$

cleverly transforms the principle of virtual work from the realm of statics to dynamics. Application of virtual work to statics primarily leads to algebraic equations between the forces, whereas d'Alembert's principle applied to dynamics leads to differential equations.

Lagrange equations of motion

Lagrange used d'Alembert's Principle to derive the basic equations of Lagrangian mechanics. This proof clearly illustrates the role of the calculus of variations in Lagrangian mechanics as well as elucidating the role of forces in the theory. The d'Alembert Principle leads to Euler's variational equation for the kinetic energy plus the active forces $Q_j$ for each coordinate $j$.

$$\sum_j^N\left[\left\{\frac{d}{dt}\left(\frac{\partial T}{\partial\dot q_j}\right) - \frac{\partial T}{\partial q_j}\right\} - Q_j\right]\delta q_j = 0 \qquad (6.38)$$

If the $N$ coordinates $q_j$ are independent then for each value of $j$ the square bracket equals zero, which corresponds to Euler's equation.

The Lagrangian method concentrates solely on active forces, completely ignoring all other internal forces. In Lagrangian mechanics the generalized forces, corresponding to each generalized coordinate, can be partitioned three ways

$$Q_j = -\nabla U + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}$$

where the velocity-independent conservative forces can be absorbed into a scalar potential $U$, the holonomic constraint forces can be handled using the Lagrange multiplier term $\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$, and the remaining part of the active forces can be absorbed into the generalized force $Q_j^{EXC}$. The scalar potential energy $U$ is handled by absorbing it into the standard Lagrangian $L = T - U$. If the constraint forces are holonomic then these forces are easily and elegantly handled by use of Lagrange multipliers. All remaining forces, including dissipative forces, can be handled by including them explicitly in the generalized force $Q_j^{EXC}$.

Combining the above two equations gives

$$\sum_j^N\left[\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} - Q_j^{EXC} - \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)\right]\delta q_j = 0 \qquad (6.56)$$

Use of the Lagrange multipliers to handle the $m$ constraint forces ensures that all $N$ infinitesimals $\delta q_j$ are independent, implying that the expression in the square bracket must be zero for each of the $N$ values of $j$. This leads to $N$ Lagrange equations plus $m$ constraint relations

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} = Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) \qquad (6.60)$$

where $j = 1, 2, 3, \ldots, N$.

Application  of  Lagrangian  mechanics: 

The  optimal  way  to  exploit  Lagrangian  mechanics  is  as  follows: 

1.  Select  a set  of  independent  generalized  coordinates. 

2.  Partition  the  active  forces  into  three  groups: 

(a)  Conservative  one-body  forces 

(b)  Holonomic  constraint  forces 

(c)  Generalized  forces 

3.  Minimize  the  number  of  generalized  coordinates. 

4.  Derive  the  Lagrangian 

5.  Derive  the  equations  of  motion 

Velocity-dependent Lorentz force:

Usually velocity-dependent forces are non-holonomic. However, electromagnetism is a special case where the velocity-dependent Lorentz force $\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B})$ can be obtained from a velocity-dependent potential function $U(q, \dot q, t)$. It was shown that the velocity-dependent potential

$$U = q\Phi - q\mathbf{v}\cdot\mathbf{A} \qquad (6.74)$$

leads to the Lorentz force, where $\Phi$ is the scalar electric potential and $\mathbf{A}$ the vector potential.

Time-dependent  forces: 

It  was  shown  that  time-dependent  forces  can  lead  to  complicated  motion  having  both  stable  regions  and 
unstable  regions  of  motion  that  can  exhibit  chaos. 

Impulsive  forces: 

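A generalized impulse $\tilde{Q}_j$ can be derived for an instantaneous impulsive force from the time integral of the impulsive forces $\tilde{\mathbf{P}}_i$ given by equation 2.135 using the time integral of equation 6.17, that is

$$\Delta p_j = \tilde{Q}_j = \lim_{\tau\to0}\int_t^{t+\tau}Q_j^{EXC}d\tau \equiv \lim_{\tau\to0}\int_t^{t+\tau}\sum_i\mathbf{F}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}d\tau = \sum_i\tilde{\mathbf{P}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j} \qquad (6.79)$$

Note that the generalized impulse $\tilde{Q}_j$ can be a translational impulse $\tilde{P}_j$ with corresponding translational variable $q_j$, or an angular impulsive torque $\tilde\tau_j$ with corresponding angular variable $\phi_j$.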

Comparison  of  Newtonian  and  Lagrangian  mechanics: 

In  contrast  to  Newtonian  mechanics,  which  is  based  on  knowing  all  the  vector  forces  acting  on  a system, 
Lagrangian  mechanics  can  derive  the  equations  of  motion  using  generalized  coordinates  without  requiring 
knowledge  of  the  constraint  forces  acting  on  the  system.  Lagrangian  mechanics  provides  a remarkably 
powerful and consistent approach to deriving the equations of motion in classical mechanics,
and is especially well suited to handling systems that are subject to holonomic constraints.

Workshop exercises

1. A disk of mass $M$ and radius $R$ rolls without slipping down a plane inclined from the horizontal by an angle
$\alpha$. The disk has a short weightless axle of negligible radius. From this axle is suspended a simple pendulum of
length  l < R and  whose  bob  has  a mass  m.  Assume  that  the  motion  of  the  pendulum  takes  place  in  the  plane 
of  the  disk. 

(a)  What  generalized  coordinates  would  be  appropriate  for  this  situation? 

(b)  Are  there  any  equations  of  constraint?  If  so,  what  are  they? 

(c)  Find  Lagrange’s  equations  for  this  system. 


2.  A Lagrangian  for  a particular  system  can  be  written  as 

$$L = \frac{m}{2}\left(a\dot{x}^2 + 2b\dot{x}\dot{y} + c\dot{y}^2\right) - \frac{K}{2}\left(ax^2 + 2bxy + cy^2\right)$$

where $a$, $b$, and $c$ are arbitrary constants, but subject to the condition that $b^2 - 4ac \neq 0$.

(a) What are the equations of motion?

(b) Examine the case $a = 0 = c$. What physical system does this represent?

(c) Examine the case $b = 0$ and $a = -c$. What physical system does this represent?

(d)  Based  on  your  answers  to  (b)  and  (c),  determine  the  physical  system  represented  by  the  Lagrangian  given 
above. 


3. Consider a particle of mass $m$ moving in a plane and subject to an inverse square attractive force.


(a)  Obtain  the  equations  of  motion. 

(b)  Is  the  angular  momentum  about  the  origin  conserved? 

(c)  Obtain  expressions  for  the  generalized  forces.  Recall  that  the  generalized  forces  are  defined  by 


$$Q_j = \sum_i F_i\frac{\partial x_i}{\partial q_j}$$


4. Consider a Lagrangian function of the form $L(q_i, \dot{q}_i, \ddot{q}_i, t)$. Here the Lagrangian contains a time derivative
of the generalized coordinates that is higher than the first. When working with such Lagrangians, the term
“generalized mechanics” is used.

(a) Consider a system with one degree of freedom. By applying the methods of the calculus of variations,
and assuming that Hamilton’s principle holds with respect to variations which keep both $q$ and $\dot{q}$ fixed at
the end points, show that the corresponding Lagrange equation is

$$\frac{d^2}{dt^2}\left(\frac{\partial L}{\partial \ddot{q}}\right) - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right) + \frac{\partial L}{\partial q} = 0$$

Such equations of motion have interesting applications in chaos theory.

(b) Apply this result to the Lagrangian


Do  you  recognize  the  equations  of  motion? 


5. A bead of mass $m$ slides under gravity along a smooth wire bent in the shape of a parabola $x^2 = az$ in the
vertical $(x, z)$ plane.

(a) What kind (holonomic, nonholonomic, scleronomic, rheonomic) of constraint acts on $m$?

(b) Set up Lagrange’s equation of motion for $x$ with the constraint embedded.

(c) Set up Lagrange’s equations of motion for both $x$ and $z$ with the constraint adjoined and a Lagrange
multiplier $\lambda$ introduced.

(d) Show that the same equation of motion for $x$ results from either of the methods used in part (b) or part
(c).

(e) Express $\lambda$ in terms of $x$ and $\dot{x}$.

(f) What are the $x$ and $z$ components of the force of constraint in terms of $x$ and $\dot{x}$?

6.  Consider  the  two  Lagrangians 

$$L(q,\dot{q};t) \quad\text{and}\quad L'(q,\dot{q};t) = L(q,\dot{q};t) + \frac{dF(q,t)}{dt}$$

where  F(q,t)  is  an  arbitrary  function  of  the  generalized  coordinates  q(t).  Show  that  these  two  Lagrangians 
yield  the  same  Euler-Lagrange  equations.  As  a consequence  two  Lagrangians  that  differ  only  by  an  exact  time 
derivative  are  said  to  be  equivalent. 

7. Consider the double pendulum comprising masses $m_1$ and $m_2$ connected by inextensible strings as shown in
the  figure.  Assume  that  the  motion  of  the  pendulum  takes  place  in  a vertical  plane. 


(a)  Are  there  any  equations  of  constraint?  If  so,  what  are  they? 

(b)  Find  Lagrange’s  equations  for  this  system. 


8. Consider the system shown in the figure which consists of a mass $m$ suspended via a constrained massless link
of length $L$ where the point A is acted upon by a spring of spring constant $k$. The spring is unstretched when
the massless link is horizontal. Assume that the holonomic constraints at A and B are frictionless.

(a) Derive the equations of motion for the system using the method of Lagrange multipliers.


9. Consider a pendulum, with mass $m$, connected to a (horizontally) moveable support of mass $M$.

(a) Determine the Lagrangian of the system.

(b) Determine the equations of motion for $\theta \ll 1$.

(c) Find an equation of motion in $\theta$ alone. What is the frequency of oscillation?

(d) What is the frequency of oscillation for $M \gg m$? Does this make sense?

Problems

1. A sphere of radius $\rho$ is constrained to roll without slipping on the lower half of the inner surface of a hollow
cylinder  of  radius  R.  Determine  the  Lagrangian  function,  the  equation  of  constraint,  and  the  Lagrange  equations 
of  motion.  Find  the  frequency  of  small  oscillations. 

2. A particle moves in a plane under the influence of a force $f = -Ar^{\alpha-1}$ directed toward the origin; $A$ and
$\alpha$ ($> 0$) are constants. Choose generalized coordinates with the potential energy zero at the origin.

a)  Find  the  Lagrangian  equations  of  motion. 

b)  Is  the  angular  momentum  about  the  origin  conserved? 

c)  Is  the  total  energy  conserved? 

3. Two blocks, each of mass $M$, are connected by an extensionless, uniform string of length $l$. One block is placed
on  a frictionless  horizontal  surface,  and  the  other  block  hangs  over  the  side,  the  string  passing  over  a frictionless 
pulley.  Describe  the  motion  of  the  system: 

a)  when  the  mass  of  the  string  is  negligible 

b)  when  the  string  has  mass  m. 

4. Two masses $m_1$ and $m_2$ ($m_1 \neq m_2$) are connected by a rigid rod of length $d$ and of negligible mass. An
extensionless string of length $l_1$ is attached to $m_1$ and connected to a fixed point of the support $P$. Similarly
a string of length $l_2$ ($l_1 \neq l_2$) connects $m_2$ and $P$. Obtain the equation of motion describing the motion in
the plane of $m_1$, $m_2$, and $P$, and find the frequency of small oscillation around the equilibrium position.

5. A thin uniform rigid rod of length $2L$ and mass $M$ is suspended by a massless string of length $l$. Initially the
system  is  hanging  vertically  downwards  in  the  gravitational  field  g.  Use  as  generalized  coordinates  the  angles 
given  in  the  diagram. 

a)  Derive  the  Lagrangian  for  the  system. 

b)  Use  the  Lagrangian  to  derive  the  equations  of  motion. 

c) A horizontal impulsive force $F_x$ in the $x$ direction strikes the bottom end of the rod for an infinitesimal
time $\tau$. Derive the initial conditions for the system immediately after the impulse has occurred.

d) Draw a diagram showing the geometry of the pendulum shortly after the impulse when the displacement
angles are significant.

Symmetries, Invariance and the Hamiltonian

7.1  Introduction 

The discussion of Lagrangian dynamics illustrates the power of Lagrangian mechanics for deriving the equations
of motion. In contrast to Newtonian mechanics, which is given in terms of force vectors acting on a
system, the Lagrangian method, based on d’Alembert’s Principle or Hamilton’s Principle, is expressed in
terms of the scalar kinetic and potential energies of the system. The Lagrangian approach is a sophisticated
alternative to Newton’s laws of motion that provides a simpler derivation of the equations of motion and
allows constraint forces to be ignored. In addition, the use of Lagrange multipliers or generalized forces
allows the Lagrangian approach to determine the constraint forces when these forces are of interest. The
equations of motion, derived either from Newton’s Laws or Lagrangian dynamics, can be non-trivial to
solve mathematically. It is necessary to integrate second-order differential equations, which for $n$ degrees of
freedom, imply $2n$ constants of integration.

Chapter  7 will  explore  the  remarkable  connection  between  symmetry  and  invariance  of  a system  under 
transformation,  and  the  related  conservation  laws  that  imply  the  existence  of  constants  of  motion.  Even 
when  the  equations  of  motion  cannot  be  solved  easily,  it  is  possible  to  derive  important  physical  principles 
regarding  the  first-order  integrals  of  motion  of  the  system  directly  from  the  Lagrange  equation,  as  well  as 
elucidating  the  underlying  symmetries  plus  invariance.  This  property  is  contained  in  Noether’s  theorem 
which  states  that  conservation  laws  are  associated  with  differentiable  symmetries  of  a physical  system. 


7.2  Generalized  momentum 

Consider a holonomic system of $N$ masses under the influence of conservative forces that depend on position
$q_j$ but not velocity $\dot{q}_j$, that is, the potential is velocity independent. Then for the $x$ coordinate of particle $i$
for $N$ particles

$$\frac{\partial L}{\partial\dot{x}_i} = \frac{\partial T}{\partial\dot{x}_i} - \frac{\partial U}{\partial\dot{x}_i} = \frac{\partial T}{\partial\dot{x}_i} = \frac{\partial}{\partial\dot{x}_i}\sum_{i=1}^{N}\frac{1}{2}m_i\left(\dot{x}_i^2+\dot{y}_i^2+\dot{z}_i^2\right) = m_i\dot{x}_i = p_{i,x} \tag{7.1}$$

Thus for a holonomic, conservative, velocity-independent potential we have

$$\frac{\partial L}{\partial\dot{x}_i} = p_{i,x} \tag{7.2}$$

which is the $x$ component of the linear momentum for the $i$th particle.

This result suggests an obvious extension of the concept of momentum to generalized coordinates. The
generalized momentum associated with the coordinate $q_j$ is defined to be

$$\frac{\partial L}{\partial\dot{q}_j} \equiv p_j \tag{7.3}$$


Note that $p_j$ also is called the conjugate momentum or canonical momentum to $q_j$, where $q_j$, $p_j$ are
conjugate, or canonical, variables. Remember that the linear momentum $p_j$ is the first-order time integral
given by equation 2.10. If $q_j$ is not a spatial coordinate, then $p_j$ is the generalized momentum, not the
kinematic linear momentum. For example, if $q_j$ is an angle, then $p_j$ will be angular momentum. That
is, the generalized momentum may differ from the usual linear or angular momentum since the definition
(7.3) is more general than the usual definition of momentum, $p_x = m\dot{x}$, in classical mechanics. This is
illustrated by the case of moving charged particles $m_j, e_j$ in an electromagnetic field. Chapter 6 showed
that electromagnetic forces on a charge $e_j$ can be described in terms of a scalar potential $U_j$ where

$$U_j = e_j(\Phi - \mathbf{A}\cdot\mathbf{v}_j) \tag{7.4}$$

Thus the Lagrangian for the electromagnetic force can be written as

$$L = \sum_{j=1}^{N}\left[\frac{1}{2}m_j\mathbf{v}_j\cdot\mathbf{v}_j - e_j(\Phi - \mathbf{A}\cdot\mathbf{v}_j)\right] \tag{7.5}$$

The generalized momentum to the coordinate $x_j$ for charge $e_j$, and mass $m_j$, is given by the above Lagrangian

$$p_{j,x} = \frac{\partial L}{\partial\dot{x}_j} = m_j\dot{x}_j + e_jA_x \tag{7.6}$$

Note  that  this  includes  both  the  mechanical  linear  momentum  plus  the  correct  electromagnetic  momentum. 
The  fact  that  the  electromagnetic  field  carries  momentum  should  not  be  a surprise  since  electromagnetic 
waves  also  carry  energy  as  is  illustrated  by  the  radiant  energy  from  the  sun. 
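
The step from the Lagrangian to the canonical momentum of a charged particle can be written out explicitly. The short derivation below is a sketch that assumes cartesian components and that $\Phi$ and $\mathbf{A}$, evaluated at the position of particle $j$, depend on position and time but not on velocity:

```latex
\begin{align*}
L &= \sum_{j=1}^{N}\left[\tfrac{1}{2}m_j\left(\dot{x}_j^2+\dot{y}_j^2+\dot{z}_j^2\right)
     - e_j\Phi + e_j\left(A_x\dot{x}_j + A_y\dot{y}_j + A_z\dot{z}_j\right)\right] \\
\frac{\partial L}{\partial \dot{x}_j} &= m_j\dot{x}_j + e_jA_x
\end{align*}
```

Only the kinetic term and the $A_x\dot{x}_j$ term depend on $\dot{x}_j$, which is why the scalar potential $\Phi$ does not appear in the canonical momentum.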


7.1  Example:  Feynman’s  angular-momentum  paradox 

Feynman[Fey8f]  posed  the  following  paradox.  A circular  insulating  disk , mounted  on  frictionless  bearings, 
has  a circular  ring  of  total  charge  q uniformly  distributed  around  the  perimeter  of  the  circular  disk  at  the 
radius  R.  A superconducting  long  solenoid  of  radius  s,  where  s < R,  is  fixed  to  the  disk  and  is  mounted 
coaxial  with  the  bearings.  The  moment  of  inertia  of  the  system  about  the  rotation  axis  is  I.  Initially  the  disk 
plus  superconducting  solenoid  are  stationary  with  a steady  current  producing  a uniform  magnetic  field  B0 
inside  the  solenoid.  Assume  that  a rise  in  temperature  of  the  solenoid  destroys  the  superconductivity  leading 
to  a rapid  dissipation  of  the  electric  current  and  resultant  magnetic  field.  Assume  that  the  system  is  free  to 
rotate,  no  other  forces  or  torques  are  acting  on  the  system,  and  that  the  charge  carriers  in  the  solenoid  have 
zero  mass  and  thus  do  not  contribute  to  the  angular  momentum.  Does  the  system  rotate  when  the  current  in 
the  solenoid  stops? 

Initially the system is stationary with zero mechanical angular momentum. Faraday’s Law states that, when the magnetic
field dissipates from $B_0$ to zero, there will be a torque $N$ acting
on the circumferential charge $q$ at radius $R$ due to the change
in magnetic flux $\Phi$.

$$N(t) = -qR\frac{d\Phi}{dt}$$

Since $\frac{d\Phi}{dt} < 0$, this torque leads to an angular impulse which
will equal the final mechanical angular momentum.

$$L^{MECH}_{final} = T = \int N(t)\,dt = qR\Phi$$

The initial angular momentum in the electromagnetic field can be derived using equation 7.6, plus Stokes’
theorem (Appendix H.3). Equation 2.142 gives that the final angular momentum equals the angular impulse

$$L^{EM}_{initial} = \oint r p_\phi\,dl = R\oint p_\phi\,dl = qR\oint A_\phi\,dl = qR\int\mathbf{B}\cdot d\mathbf{S} = qR\Phi$$

where $\Phi = \oint A_\phi\,dl = \int\mathbf{B}\cdot d\mathbf{S}$ is the initial total magnetic flux through the solenoid. Thus the total initial
angular momentum is given by

$$L^{TOTAL}_{initial} = 0 + L^{EM}_{initial} = qR\Phi$$

Since the final electromagnetic field is zero the final total angular momentum is given by

$$L^{TOTAL}_{final} = L^{MECH}_{final} + 0 = qR\Phi$$

Note that the total angular momentum is conserved. That is, initially all the angular momentum is stored in
the electromagnetic field, whereas the final angular momentum is all mechanical. This resolves the paradox:
the mechanical angular momentum alone is not conserved; only the total angular momentum of the system is
conserved, that is, the sum of the mechanical and electromagnetic angular momenta.


7.3  Invariant  transformations  and  Noether’s  Theorem 


One of the great advantages of Lagrangian mechanics is the freedom it allows in choice of generalized
coordinates which can simplify derivation of the equations of motion. For example, for any set of coordinates,
$q_j$, a reversible point transformation can define another set of coordinates $q'_j$ such that

$$q'_j = q'_j(q_1, q_2, \ldots, q_n; t) \tag{7.7}$$

The new set of generalized coordinates satisfies Lagrange’s equations of motion with the new Lagrangian

$$L(\mathbf{q}', \dot{\mathbf{q}}', t) = L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{7.8}$$

The Lagrangian is a scalar, with units of energy, which does not change if the coordinate representation
is changed. Thus $L(\mathbf{q}', \dot{\mathbf{q}}', t)$ can be derived from $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ by substituting the inverse relation
$q_i = q_i(q'_1, q'_2, \ldots, q'_n; t)$ into $L(\mathbf{q}, \dot{\mathbf{q}}, t)$. That is, the value of the Lagrangian $L$ is independent of which coordinate
representation is used. Although the general form of Lagrange’s equations of motion is preserved in any
point transformation, the explicit equations of motion for the new variables usually look different from those
with the old variables. A typical example is the transformation from cartesian to spherical coordinates.
For a given system, there can be particular transformations for which the explicit equations of motion are
the same for both the old and new variables. Transformations where the equations of motion are invariant
are called invariant transformations. It will be shown that if the Lagrangian does not explicitly contain
a particular coordinate of displacement $q_i$, then the corresponding conjugate momentum, $p_i$, is conserved.
This relation is called Noether’s theorem which states “For each symmetry of the Lagrangian, there is a
conserved quantity”.

Noether’s Theorem will be used to consider invariant transformations for two dependent variables, $x(t)$
and $\theta(t)$, plus their conjugate momenta $p_x$ and $p_\theta$. For a closed system, these provide up to six possible
conservation laws for the three axes. Then we will discuss the independent variable $t$, and its relation to
the Generalized Energy Theorem, which provides another possible conservation law. For simplicity, these
discussions assume that the systems are holonomic and conservative.

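The Lagrange equations using generalized coordinates for holonomic systems were given by equation 6.60
to be

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] \tag{7.9}$$

This can be written in terms of the generalized momentum as

$$\left\{\frac{d}{dt}p_j - \frac{\partial L}{\partial q_j}\right\} = \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] \tag{7.10}$$

or equivalently as

$$\dot{p}_j = \frac{\partial L}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] \tag{7.11}$$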


Note that if the Lagrangian $L$ does not contain $q_j$ explicitly, that is, the Lagrangian is invariant to a linear
translation, or equivalently, is spatially homogeneous, and if the Lagrange multiplier constraint force and
generalized force terms are zero, then

$$\frac{\partial L}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] = 0 \tag{7.12}$$

In this case the Lagrange equation reduces to

$$\dot{p}_j = \frac{dp_j}{dt} = 0 \tag{7.13}$$

Equation 7.13 corresponds to $p_j$ being a constant of motion. Stated in words, the generalized momentum $p_j$
is a constant of motion if the Lagrangian is invariant to a spatial translation of $q_j$, and the constraint plus
generalized force terms are zero. Expressed another way, if the Lagrangian does not contain a given coordinate
$q_j$ and the corresponding constraint plus generalized forces are zero, then the generalized momentum
associated with this coordinate is conserved. Note that this example of Noether’s theorem applies to any
component of $\mathbf{q}$. For example, in the uniform gravitational field at the surface of the earth, the Lagrangian
does not depend on the $x$ and $y$ coordinates in the horizontal plane, thus $p_x$ and $p_y$ are conserved, whereas,
due to the gravitational force, the Lagrangian does depend on the vertical $z$ axis and thus $p_z$ is not conserved.


7.2 Example: Atwood’s machine

Assume that the linear momentum is conserved for the Atwood’s machine shown in the adjacent figure.
Let the left mass rise a distance $x$ and the right mass rise a distance $y$. Then the middle mass must drop
by $x + y$ to conserve the length of the string. The Lagrangian of the system is

$$L = \frac{1}{2}(4m)\dot{x}^2 + \frac{1}{2}(3m)(-\dot{x}-\dot{y})^2 + \frac{1}{2}m\dot{y}^2 - \left(4mgx + 3mg(-x-y) + mgy\right) = \frac{7}{2}m\dot{x}^2 + 3m\dot{x}\dot{y} + 2m\dot{y}^2 - mg(x-2y)$$

Figure: Example of an Atwood’s machine

Note that the transformation

$$x = x_0 + 2\epsilon$$
$$y = y_0 + \epsilon$$

results in the potential energy term $mg(x-2y) = mg(x_0-2y_0)$,
which does not depend on $\epsilon$. As a result the Lagrangian
is independent of $\epsilon$, which means that it is invariant to the
small perturbation $\epsilon$, and thus $\frac{\partial L}{\partial\epsilon} = 0$. Therefore, according
to Noether’s theorem, the corresponding linear momentum
$P_\epsilon = \frac{\partial L}{\partial\dot{\epsilon}}$ is conserved. This conserved linear momentum
then is given by

$$P_\epsilon = \frac{\partial L}{\partial\dot{\epsilon}} = \frac{\partial L}{\partial\dot{x}}\frac{\partial\dot{x}}{\partial\dot{\epsilon}} + \frac{\partial L}{\partial\dot{y}}\frac{\partial\dot{y}}{\partial\dot{\epsilon}} = m(7\dot{x} + 3\dot{y})(2) + m(3\dot{x} + 4\dot{y}) = m(17\dot{x} + 10\dot{y})$$

Thus, if the system starts at rest with $P_\epsilon = 0$, then $\dot{x}$ always equals $-\frac{10}{17}\dot{y}$ since $P_\epsilon$ is constant.

Note that this also can be shown using the Euler-Lagrange equations in that $\Lambda_x L = 0$ and $\Lambda_y L = 0$ give

$$7m\ddot{x} + 3m\ddot{y} = -mg$$
$$3m\ddot{x} + 4m\ddot{y} = 2mg$$

Adding the second equation to twice the first gives

$$17m\ddot{x} + 10m\ddot{y} = \frac{d}{dt}(17m\dot{x} + 10m\dot{y}) = 0$$

This  is  the  result  obtained  directly  using  Noether’s  theorem. 
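
The arithmetic of this example can be checked exactly. The following minimal sketch solves the two Euler-Lagrange equations for the accelerations with Cramer's rule in rational arithmetic and confirms that $17\ddot{x} + 10\ddot{y}$ vanishes. The function name and the numerical value of $g$ are illustrative assumptions; the mass $m$ cancels from both equations.

```python
from fractions import Fraction


def atwood_accelerations(g=Fraction(981, 100)):
    """Solve 7*ax + 3*ay = -g and 3*ax + 4*ay = 2*g by Cramer's rule.

    The common factor m has been divided out of both equations.
    """
    a11, a12, b1 = Fraction(7), Fraction(3), -g
    a21, a22, b2 = Fraction(3), Fraction(4), 2 * g
    det = a11 * a22 - a12 * a21          # 7*4 - 3*3 = 19
    ax = (b1 * a22 - a12 * b2) / det     # -10*g/19
    ay = (a11 * b2 - a21 * b1) / det     # 17*g/19
    return ax, ay


ax, ay = atwood_accelerations()
# Time derivative of the conserved momentum P_eps, divided by m.
p_eps_rate = 17 * ax + 10 * ay

if __name__ == "__main__":
    print("x acceleration:", ax)
    print("y acceleration:", ay)
    print("d(P_eps)/dt / m:", p_eps_rate)
```

Exact fractions are used rather than floating point so that the cancellation is an identity and not an approximate zero.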
7.4 Rotational invariance and conservation of angular momentum

The arguments used above apply equally well to the conjugate momentum $p_\theta$ and angle $\theta$ for rotation about any axis.
The Lagrange equation is

$$\left\{\frac{d}{dt}p_\theta - \frac{\partial L}{\partial\theta}\right\} = \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial\theta}(\mathbf{q},t) + Q_\theta^{EXC}\right] \tag{7.14}$$

If no constraint or generalized torques act on the system, then the right-hand side of equation 7.14 is zero.
Moreover if the Lagrangian is not an explicit function of $\theta$, then $\frac{\partial L}{\partial\theta} = 0$, and assuming that the constraint
plus generalized torques are zero, then $p_\theta$ is a constant of motion.

Noether’s  Theorem  illustrates  this  general  result  which  can  be  stated  as,  if  the  Lagrangian  is  rotationally 
invariant  about  some  axis,  then  the  component  of  the  angular  momentum  along  that  axis  is  conserved.  Also 
this  is  true  for  the  more  general  case  where  the  Lagrangian  is  invariant  to  rotation  about  any  axis,  which 
leads  to  conservation  of  the  total  angular  momentum. 


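7.3 Example: Conservation of angular momentum for rotational invariance

The Noether theorem result for rotational invariance about an
axis also can be derived using cartesian coordinates as shown below.
As discussed in appendix D, it is necessary to limit discussion of
rotation to infinitesimal rotation angles in order to represent the
rotation by a vector. Consider an infinitesimal rotation $\delta\boldsymbol{\theta}$ about
some axis, which is a vector. As illustrated in the adjacent figure,
this can be expressed as

$$\delta\mathbf{r} = \delta\boldsymbol{\theta}\times\mathbf{r}$$

The velocity vectors also change on rotation of the system obeying
the transformation equation which is common to all vectors, that
is,

$$\delta\dot{\mathbf{r}} = \delta\boldsymbol{\theta}\times\dot{\mathbf{r}}$$

If the Lagrangian is unaffected by the orientation of the system,
that is, it is rotationally invariant, then it can be shown that the
angular momentum is conserved. For example, consider that the
Lagrangian is invariant to rotation about some axis $q_i$. Since the
Lagrangian is a function

$$L = L(q_i, \dot{q}_i; t)$$

then the expression that the Lagrangian does not change due to an infinitesimal rotation $\delta\boldsymbol{\theta}$ about this axis
can be expressed as

$$\delta L = \sum_i\frac{\partial L}{\partial x_i}\delta x_i + \sum_i\frac{\partial L}{\partial\dot{x}_i}\delta\dot{x}_i = 0 \tag{A}$$

where cartesian coordinates have been used. Using the generalized momentum

$$p_i = \frac{\partial L}{\partial\dot{x}_i}$$

then, Lagrange’s equation gives

$$\frac{d}{dt}p_i - \frac{\partial L}{\partial x_i} = 0$$

that is

$$\dot{p}_i = \frac{\partial L}{\partial x_i}$$

Inserting this into equation A gives

$$\delta L = \sum_i^3\dot{p}_i\delta x_i + \sum_i^3 p_i\delta\dot{x}_i = 0$$

This is equivalent to the scalar products

$$\dot{\mathbf{p}}\cdot\delta\mathbf{r} + \mathbf{p}\cdot\delta\dot{\mathbf{r}} = 0$$

For an infinitesimal rotation $\delta\boldsymbol{\theta}$, then $\delta\mathbf{r} = \delta\boldsymbol{\theta}\times\mathbf{r}$ and $\delta\dot{\mathbf{r}} = \delta\boldsymbol{\theta}\times\dot{\mathbf{r}}$. Therefore

$$\dot{\mathbf{p}}\cdot(\delta\boldsymbol{\theta}\times\mathbf{r}) + \mathbf{p}\cdot(\delta\boldsymbol{\theta}\times\dot{\mathbf{r}}) = 0$$

The cyclic order can be permuted giving

$$\delta\boldsymbol{\theta}\cdot(\mathbf{r}\times\dot{\mathbf{p}}) + \delta\boldsymbol{\theta}\cdot(\dot{\mathbf{r}}\times\mathbf{p}) = 0$$
$$\delta\boldsymbol{\theta}\cdot\left[(\mathbf{r}\times\dot{\mathbf{p}}) + (\dot{\mathbf{r}}\times\mathbf{p})\right] = 0$$
$$\delta\boldsymbol{\theta}\cdot\frac{d}{dt}(\mathbf{r}\times\mathbf{p}) = 0$$

Because the infinitesimal angle $\delta\boldsymbol{\theta}$ is arbitrary, then the time derivative

$$\frac{d}{dt}(\mathbf{r}\times\mathbf{p}) = 0$$

about the axis of rotation $\delta\boldsymbol{\theta}$. But the bracket $(\mathbf{r}\times\mathbf{p})$ equals the angular momentum. That is,

$$\text{Angular momentum} = (\mathbf{r}\times\mathbf{p}) = \text{constant}$$

This proves Noether’s theorem that the angular momentum about any axis is conserved if the Lagrangian
is rotationally invariant about that axis.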
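
The statement that only the component along the symmetry axis is conserved can be illustrated numerically. The sketch below is an illustration under stated assumptions, not a calculation from the text: a unit-mass particle moves in the hypothetical potential $U = \frac{1}{2}kr^2 + gz$, which is invariant under rotations about the $z$ axis only. A fourth-order Runge-Kutta integrator (a choice made here for simplicity) evolves the motion and $\mathbf{r}\times\mathbf{p}$ is compared before and after.

```python
# Angular momentum check for a potential that is symmetric about z only.
# Assumed test system (illustrative): m = 1, U = 0.5*k*r^2 + g*z.

K_SPRING = 1.0
G_FIELD = 0.5


def accel(r):
    """Acceleration -grad(U) for unit mass."""
    x, y, z = r
    return (-K_SPRING * x, -K_SPRING * y, -K_SPRING * z - G_FIELD)


def axpy(a, x, y):
    """Return a*x + y for 3-vectors stored as tuples."""
    return tuple(a * xi + yi for xi, yi in zip(x, y))


def rk4_step(r, v, dt):
    """One classical Runge-Kutta step for dr/dt = v, dv/dt = accel(r)."""
    k1r, k1v = v, accel(r)
    k2r, k2v = axpy(dt / 2, k1v, v), accel(axpy(dt / 2, k1r, r))
    k3r, k3v = axpy(dt / 2, k2v, v), accel(axpy(dt / 2, k2r, r))
    k4r, k4v = axpy(dt, k3v, v), accel(axpy(dt, k3r, r))
    r_new = tuple(ri + dt / 6 * (a + 2 * b + 2 * c + d)
                  for ri, a, b, c, d in zip(r, k1r, k2r, k3r, k4r))
    v_new = tuple(vi + dt / 6 * (a + 2 * b + 2 * c + d)
                  for vi, a, b, c, d in zip(v, k1v, k2v, k3v, k4v))
    return r_new, v_new


def angular_momentum(r, v, m=1.0):
    """Cartesian components of r x p with p = m*v."""
    return (m * (r[1] * v[2] - r[2] * v[1]),
            m * (r[2] * v[0] - r[0] * v[2]),
            m * (r[0] * v[1] - r[1] * v[0]))


def evolve(r, v, dt=1e-3, steps=5000):
    for _ in range(steps):
        r, v = rk4_step(r, v, dt)
    return r, v


r0, v0 = (1.0, 0.0, 0.3), (0.0, 0.8, 0.2)
L_start = angular_momentum(r0, v0)
r1, v1 = evolve(r0, v0)
L_end = angular_momentum(r1, v1)

if __name__ == "__main__":
    print("initial r x p:", L_start)
    print("final   r x p:", L_end)
```

In this sketch the $z$ component of $\mathbf{r}\times\mathbf{p}$ stays fixed to integration accuracy, while the $x$ and $y$ components change because the $gz$ term breaks rotational invariance about those axes.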

7.4 Example: Diatomic molecules and axially-symmetric nuclei

An interesting example of Noether’s theorem applies to diatomic molecules such as H$_2$, N$_2$, F$_2$, O$_2$, Cl$_2$,
and Br$_2$. The electric field produced by the two charged nuclei of the diatomic molecule has cylindrical
symmetry about the axis through the two nuclei. Electrons are bound to this dumbbell arrangement of the two
nuclear charges which may be rotating and vibrating in free space. Assuming that there are no external torques
acting on the diatomic molecule in free space, then the component of the angular momentum about any fixed
axis in free space must be conserved according to Noether’s theorem, that is, the total angular momentum is conserved.
What is especially interesting is that since the electrostatic potential, and thus the Lagrangian, of the diatomic
molecule has cylindrical symmetry, that is $\frac{\partial L}{\partial\phi} = 0$ for rotation by an angle $\phi$ about the symmetry axis, then the component of the angular momentum with respect
to this symmetry axis also is conserved irrespective of how the diatomic molecule rotates or vibrates in free
space. That is, an additional symmetry has been identified that leads to an additional conservation law that
applies to the angular momentum.

Another example of Noether’s theorem is in nuclear physics where some nuclei have a spheroidal shape similar
to an American football or a rugby ball. This spheroidal shape has an axis of symmetry along the long axis.
The Lagrangian is rotationally invariant about the symmetry axis resulting in the angular momentum about
the symmetry axis being conserved in addition to conservation of the total angular momentum.


7.5  Cyclic  coordinates 


Translational and rotational invariance occurs when a system has a cyclic coordinate $q_k$. A cyclic coordinate
is one that does not explicitly appear in the Lagrangian. The term cyclic is a natural name when one has
cylindrical or spherical symmetry. In Hamiltonian mechanics a cyclic coordinate often is called an ignorable
coordinate. By virtue of Lagrange’s equations

$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_k} - \frac{\partial L}{\partial q_k} = 0 \tag{7.15}$$

then a cyclic coordinate $q_k$ is one for which $\frac{\partial L}{\partial q_k} = 0$. Thus

$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_k} = \dot{p}_k = 0 \tag{7.16}$$

that is, $p_k$ is a constant of motion if the conjugate coordinate $q_k$ is cyclic. This is just Noether’s Theorem.
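
As an illustrative worked case (a standard example, not taken from this section), consider a particle of mass $m$ moving in a plane under a central potential $U(r)$, described by plane polar coordinates $(r, \theta)$. The symbols $U(r)$ and $l$ are introduced here only for illustration:

```latex
\begin{align*}
L &= \tfrac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) - U(r) \\
\frac{\partial L}{\partial\theta} &= 0 \quad\Rightarrow\quad \theta \text{ is cyclic} \\
p_\theta &= \frac{\partial L}{\partial\dot{\theta}} = mr^2\dot{\theta} = l = \text{constant}
\end{align*}
```

Here $r$ is not cyclic, since both the kinetic term and $U(r)$ depend on it, so $p_r = m\dot{r}$ is not conserved in general.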
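7.6 Kinetic energy in generalized coordinates

Application of Noether’s theorem to the conservation of energy requires the kinetic energy to be expressed
in generalized coordinates. In terms of fixed rectangular coordinates, the kinetic energy for $N$ bodies, each
having three degrees of freedom, is expressed as

$$T = \frac{1}{2}\sum_{\alpha=1}^{N}\sum_{i=1}^{3}m_\alpha\dot{x}_{\alpha,i}^2 \tag{7.17}$$

These can be expressed in terms of generalized coordinates as $x_{\alpha,i} = x_{\alpha,i}(q_j,t)$ and in terms of generalized
velocities

$$\dot{x}_{\alpha,i} = \sum_j\frac{\partial x_{\alpha,i}}{\partial q_j}\dot{q}_j + \frac{\partial x_{\alpha,i}}{\partial t} \tag{7.18}$$

Taking the square of $\dot{x}_{\alpha,i}$ and inserting into the kinetic energy relation gives

$$T(\mathbf{q},\dot{\mathbf{q}},t) = \sum_\alpha\sum_{i,j,k}\frac{1}{2}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k}\dot{q}_j\dot{q}_k + \sum_\alpha\sum_{i,j}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial t}\dot{q}_j + \sum_\alpha\sum_i\frac{1}{2}m_\alpha\left(\frac{\partial x_{\alpha,i}}{\partial t}\right)^2 \tag{7.19}$$

This can be abbreviated as

$$T(\mathbf{q},\dot{\mathbf{q}},t) = T_2(\mathbf{q},\dot{\mathbf{q}},t) + T_1(\mathbf{q},\dot{\mathbf{q}},t) + T_0(\mathbf{q},t) \tag{7.20}$$

where

$$T_2(\mathbf{q},\dot{\mathbf{q}},t) = \sum_\alpha\sum_{i,j,k}\frac{1}{2}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k}\dot{q}_j\dot{q}_k = \sum_{j,k}a_{jk}\dot{q}_j\dot{q}_k \tag{7.21}$$

$$T_1(\mathbf{q},\dot{\mathbf{q}},t) = \sum_\alpha\sum_{i,j}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial t}\dot{q}_j = \sum_j b_j\dot{q}_j \tag{7.22}$$

$$T_0(\mathbf{q},t) = \sum_\alpha\sum_i\frac{1}{2}m_\alpha\left(\frac{\partial x_{\alpha,i}}{\partial t}\right)^2 \tag{7.23}$$

where

$$a_{jk} \equiv \sum_\alpha\sum_i\frac{1}{2}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k} \tag{7.24}$$

When the transformed system is scleronomic, time does not appear explicitly in the transformation
equations to generalized coordinates since $\frac{\partial x_{\alpha,i}}{\partial t} = 0$. Then $T_1 = T_0 = 0$, and the kinetic energy reduces to
a homogeneous quadratic function of the generalized velocities

$$T(\mathbf{q},\dot{\mathbf{q}},t) = T_2(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.25}$$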

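A useful relation can be derived by taking the differential of equation 7.21 with respect to $\dot{q}_l$. That is

$$\frac{\partial T_2(\mathbf{q},\dot{\mathbf{q}},t)}{\partial\dot{q}_l} = \sum_k a_{lk}\dot{q}_k + \sum_j a_{jl}\dot{q}_j \tag{7.26}$$

Multiplying this by $\dot{q}_l$ and summing over $l$ gives

$$\sum_l\dot{q}_l\frac{\partial T_2(\mathbf{q},\dot{\mathbf{q}},t)}{\partial\dot{q}_l} = \sum_{k,l}a_{lk}\dot{q}_k\dot{q}_l + \sum_{j,l}a_{jl}\dot{q}_j\dot{q}_l = 2\sum_{j,k}a_{jk}\dot{q}_j\dot{q}_k = 2T_2$$

Similarly, the products of the generalized velocities $\dot{q}_l$ with the corresponding derivatives of $T_1$ and $T_0$ give

$$\sum_l\dot{q}_l\frac{\partial T_2(\mathbf{q},\dot{\mathbf{q}},t)}{\partial\dot{q}_l} = 2T_2 \tag{7.27}$$

$$\sum_l\dot{q}_l\frac{\partial T_1(\mathbf{q},\dot{\mathbf{q}},t)}{\partial\dot{q}_l} = T_1(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.28}$$

$$\sum_l\dot{q}_l\frac{\partial T_0(\mathbf{q},t)}{\partial\dot{q}_l} = 0 \tag{7.29}$$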
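
These three relations are instances of Euler's theorem for homogeneous functions of degree 2, 1 and 0 in the generalized velocities. The sketch below checks them numerically with central differences for a two-coordinate toy system. The coefficient values `A_COEF`, `B_COEF`, `T0_CONST` and the helper `euler_sum` are hypothetical, chosen only for illustration, and the coefficients are taken as constants at a fixed configuration.

```python
# Numerical check of sum_l qdot_l * dT_n/dqdot_l = n * T_n for n = 2, 1, 0.
# Illustrative (assumed) coefficients evaluated at one fixed configuration q.

A_COEF = [[2.0, 0.5],
          [0.5, 1.0]]      # a_jk, quadratic coefficients
B_COEF = [0.3, -0.7]       # b_j, linear coefficients
T0_CONST = 1.25            # velocity-independent term


def T2(qdot):
    return sum(A_COEF[j][k] * qdot[j] * qdot[k]
               for j in range(2) for k in range(2))


def T1(qdot):
    return sum(B_COEF[j] * qdot[j] for j in range(2))


def T0(qdot):
    return T0_CONST


def euler_sum(f, qdot, h=1e-5):
    """Return sum_l qdot_l * df/dqdot_l using central differences."""
    total = 0.0
    for l in range(len(qdot)):
        up = list(qdot)
        dn = list(qdot)
        up[l] += h
        dn[l] -= h
        total += qdot[l] * (f(up) - f(dn)) / (2 * h)
    return total


if __name__ == "__main__":
    qdot = [0.4, -1.1]
    print(euler_sum(T2, qdot), 2 * T2(qdot))
    print(euler_sum(T1, qdot), T1(qdot))
    print(euler_sum(T0, qdot), 0.0)
```

Central differences are exact (up to rounding) for polynomials of degree two or less, so each printed pair should agree closely.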
Equation 7.25 gives that $T = T_2$ when the transformed system is scleronomic, i.e. $\frac{\partial x_{\alpha,i}}{\partial t} = 0$, and then the
kinetic energy is a quadratic function of the generalized velocities $\dot{q}_j$. Using the definition of the generalized
momentum equation 7.3, assuming $T = T_2$, and that the potential $U$ is velocity independent, gives that

$$p_i \equiv \frac{\partial L}{\partial\dot{q}_i} = \frac{\partial T}{\partial\dot{q}_i} - \frac{\partial U}{\partial\dot{q}_i} = \frac{\partial T_2}{\partial\dot{q}_i} \tag{7.30}$$

Then equation 7.27 reduces to the useful relation that

$$T_2 = \frac{1}{2}\sum_i\dot{q}_i p_i = \frac{1}{2}\dot{\mathbf{q}}\cdot\mathbf{p} \tag{7.31}$$


where,  for  compactness,  the  summation  is  abbreviated  as  a scalar  product. 


7.7  Generalized  energy  and  the  Hamiltonian  function 


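Consider the time derivative of the Lagrangian, plus the fact that time is the independent variable in the
Lagrangian. Then the total time derivative is

$$\frac{dL}{dt} = \sum_j\frac{\partial L}{\partial q_j}\dot{q}_j + \sum_j\frac{\partial L}{\partial\dot{q}_j}\ddot{q}_j + \frac{\partial L}{\partial t} \tag{7.32}$$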


The Lagrange equations for a conservative force are given by equation 6.60 to be

$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) \tag{7.33}$$

The holonomic constraints can be accounted for using the Lagrange multiplier terms while the generalized
force $Q_j^{EXC}$ includes non-holonomic forces or other forces not included in the potential energy term of the
Lagrangian, or holonomic forces not accounted for by the Lagrange multiplier terms.

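Substituting equation 7.33 into equation 7.32 gives

$$\frac{dL}{dt} = \sum_j\dot{q}_j\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] + \sum_j\frac{\partial L}{\partial\dot{q}_j}\ddot{q}_j + \frac{\partial L}{\partial t}$$

$$\phantom{\frac{dL}{dt}} = \sum_j\frac{d}{dt}\left(\dot{q}_j\frac{\partial L}{\partial\dot{q}_j}\right) - \sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] + \frac{\partial L}{\partial t} \tag{7.34}$$

This can be written in the form

$$\frac{d}{dt}\left[\sum_j\dot{q}_j\frac{\partial L}{\partial\dot{q}_j} - L\right] = \sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] - \frac{\partial L}{\partial t} \tag{7.35}$$

Define Jacobi’s Generalized Energy$^1$ $h(\mathbf{q},\dot{\mathbf{q}},t)$ by

$$h(\mathbf{q},\dot{\mathbf{q}},t) \equiv \sum_j\dot{q}_j\frac{\partial L}{\partial\dot{q}_j} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.36}$$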


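The generalized momentum, equation 7.3, can be used to express the generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ in terms of the canonical coordinates $q_i$ and $p_i$, plus time $t$. Define the Hamiltonian function to equal the generalized energy expressed in terms of the conjugate variables $(q_j, p_j)$, that is,

$$H(\mathbf{q},\mathbf{p},t) \equiv h(\mathbf{q},\dot{\mathbf{q}},t) \equiv \sum_j \dot{q}_j\frac{\partial L}{\partial \dot{q}_j} - L(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j \dot{q}_j p_j - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.37}$$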

This  Hamiltonian  H (q,  p,t)  underlies  Hamiltonian  mechanics  which  plays  a profoundly  important  role  in 
most  branches  of  physics  as  illustrated  in  chapters  8, 14  and  17. 
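
As a brief illustration of definitions 7.36 and 7.37, consider the simplest case of a single particle moving in one dimension; this example is an assumption added for illustration and uses only the definitions above.

```latex
% Assumed illustration: one particle with L = (1/2) m \dot{x}^2 - U(x)
\begin{align*}
p &= \frac{\partial L}{\partial \dot{x}} = m\dot{x} \\
h(x,\dot{x}) &= \dot{x}\,\frac{\partial L}{\partial \dot{x}} - L
             = m\dot{x}^2 - \tfrac{1}{2}m\dot{x}^2 + U(x)
             = \tfrac{1}{2}m\dot{x}^2 + U(x) \\
H(x,p) &= \frac{p^2}{2m} + U(x)
\end{align*}
```

Here $h$ and $H$ are the same quantity; they differ only in whether it is written in terms of $\dot{x}$ or of $p$.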

¹ Most textbooks call the function $h(\mathbf{q},\dot{\mathbf{q}},t)$ Jacobi’s energy integral. This book adopts the more descriptive name Generalized energy in analogy with the use of generalized coordinates $\mathbf{q}$ and generalized momentum $\mathbf{p}$.
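
7.8  Generalized  energy  theorem


The Hamiltonian function, equation 7.37, plus equation 7.35 lead to the generalized energy theorem

$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \frac{dh(\mathbf{q},\dot{\mathbf{q}},t)}{dt} = \sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] - \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{7.38}$$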


Note that for the special case where the external-force term vanishes, that is, $\sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] = 0$, then

$$\frac{dH}{dt} = -\frac{\partial L}{\partial t} \tag{7.39}$$

Thus the Hamiltonian is time independent if both $\sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] = 0$ and the Lagrangian is not an explicit function of time. For an isolated closed system having no external forces acting, the Lagrangian has no explicit time dependence because there is no external, time-varying potential energy. That is, the Lagrangian is time-independent, and

$$\frac{d}{dt}\left[\sum_j \dot{q}_j\frac{\partial L}{\partial \dot{q}_j} - L\right] = \frac{dH}{dt} = -\frac{\partial L}{\partial t} = 0 \tag{7.40}$$

As a consequence, the Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ and the generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ both are constants of motion if the Lagrangian is not an explicit function of time and if the external non-potential forces are zero. This is an example of Noether’s theorem, where the symmetry of time independence leads to conservation of the conjugate variable, which is the Hamiltonian or Generalized energy.
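
The generalized energy theorem 7.38 can be illustrated numerically. The sketch below assumes a hypothetical damped harmonic oscillator with $L = \frac{1}{2}m\dot{x}^2 - \frac{1}{2}kx^2$ and a non-potential force $Q^{EXC} = -b\dot{x}$; there are no constraints and $\partial L/\partial t = 0$, so the theorem predicts $dH/dt = \dot{x}\,Q^{EXC}$. All parameter values are illustrative.

```python
def damped_oscillator_balance(m=1.0, k=4.0, b=0.3, x0=1.0, v0=0.0,
                              dt=1e-3, steps=2000):
    """Return (H_end - H_start, integral of xdot * Q_exc dt) for a damped oscillator."""

    def deriv(x, v):
        q_exc = -b * v                               # damping force, not part of L
        return v, (-k * x + q_exc) / m, v * q_exc    # dx/dt, dv/dt, dW/dt

    def energy(x, v):
        return 0.5 * m * v**2 + 0.5 * k * x**2       # H = T + U for this system

    x, v, work = x0, v0, 0.0
    h_start = energy(x, v)
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step for (x, v) and the work integral.
        a1 = deriv(x, v)
        a2 = deriv(x + 0.5 * dt * a1[0], v + 0.5 * dt * a1[1])
        a3 = deriv(x + 0.5 * dt * a2[0], v + 0.5 * dt * a2[1])
        a4 = deriv(x + dt * a3[0], v + dt * a3[1])
        x += dt * (a1[0] + 2 * a2[0] + 2 * a3[0] + a4[0]) / 6
        v += dt * (a1[1] + 2 * a2[1] + 2 * a3[1] + a4[1]) / 6
        work += dt * (a1[2] + 2 * a2[2] + 2 * a3[2] + a4[2]) / 6
    return energy(x, v) - h_start, work


delta_h, work_done = damped_oscillator_balance()
print(delta_h, work_done)
```

The change in $H$ should match the accumulated $\int \dot{x}\,Q^{EXC}\,dt$, and both are negative because damping removes energy. Setting `b=0` recovers the conserved case of equation 7.40.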


7.9  Generalized  energy  and  total  energy 

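The generalized kinetic energy, equation 7.20, can be used to write the generalized Lagrangian as

$$L(\mathbf{q},\dot{\mathbf{q}},t) = T_2(\mathbf{q},\dot{\mathbf{q}},t) + T_1(\mathbf{q},\dot{\mathbf{q}},t) + T_0(\mathbf{q},t) - U(\mathbf{q},t) \tag{7.41}$$

If the potential energy $U$ does not depend explicitly on the velocities $\dot{q}_i$ or time, then

$$p_j = \frac{\partial L}{\partial \dot{q}_j} = \frac{\partial (T - U)}{\partial \dot{q}_j} = \frac{\partial T}{\partial \dot{q}_j} \tag{7.42}$$

Equation 7.42 can be used to write the Hamiltonian as

$$H(\mathbf{q},\mathbf{p},t) = \sum_j \left(\dot{q}_j\frac{\partial T_2}{\partial \dot{q}_j}\right) + \sum_j \left(\dot{q}_j\frac{\partial T_1}{\partial \dot{q}_j}\right) + \sum_j \left(\dot{q}_j\frac{\partial T_0}{\partial \dot{q}_j}\right) - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.43}$$

Using equations 7.27, 7.28 and 7.29 gives that the total generalized Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ equals

$$H(\mathbf{q},\mathbf{p},t) = 2T_2 + T_1 - (T_2 + T_1 + T_0 - U) = T_2 - T_0 + U \tag{7.44}$$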


But the sum of the kinetic and potential energies equals the total energy. Thus equation 7.44 can be rewritten in the form

$$H(\mathbf{q},\mathbf{p},t) = (T + U) - (T_1 + 2T_0) = E - (T_1 + 2T_0) \tag{7.45}$$

Note that, in general, Jacobi’s generalized energy and the Hamiltonian do not equal the total energy $E$. However, in the special case where the transformation is scleronomic, so that $T_1 = T_0 = 0$, and if the potential energy $U$ does not depend explicitly on $\dot{q}_i$, then the generalized energy (Hamiltonian) equals the total energy, that is, $H = E$. Recognition of the relation between the Hamiltonian and the total energy facilitates determining the equations of motion.
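
A concrete case where $H \neq E$ helps to make equation 7.45 tangible. The sketch below assumes a hypothetical bead sliding on a frictionless horizontal wire that rotates at constant angular velocity $\omega$, a standard rheonomic system that is not taken from this section. With $L = \frac{1}{2}m(\dot{r}^2 + \omega^2 r^2)$, one has $T_2 = \frac{1}{2}m\dot{r}^2$, $T_1 = 0$, $T_0 = \frac{1}{2}m\omega^2r^2$ and $U = 0$. For a bead released at rest relative to the wire, the equation of motion $\ddot{r} = \omega^2 r$ has the solution $r = r_0\cosh\omega t$.

```python
import math


def bead_energies(t, m=1.0, omega=2.0, r0=0.5):
    """Return (H, E) at time t for a bead on a uniformly rotating wire.

    Uses the analytic solution r(t) = r0*cosh(omega*t) of rddot = omega**2 * r
    for a bead released with zero radial velocity (assumed initial condition).
    """
    r = r0 * math.cosh(omega * t)
    rdot = r0 * omega * math.sinh(omega * t)
    t2 = 0.5 * m * rdot**2            # quadratic in the generalized velocity
    t0 = 0.5 * m * (omega * r)**2     # velocity-independent part of T
    h = t2 - t0                       # equation 7.44 with T1 = 0, U = 0
    e = t2 + t0                       # total energy T + U
    return h, e


for t in (0.0, 0.5, 1.0):
    print(t, bead_energies(t))
```

The Hamiltonian stays at $-\frac{1}{2}m\omega^2r_0^2$ while the total energy grows, because the agent that keeps the wire rotating does work on the bead. The difference $E - H = 2T_0$ agrees with equation 7.45.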

7.10  Hamiltonian  invariance

Sections 7.8 and 7.9 addressed two important and independent features of the Hamiltonian: a) when $H$ is conserved, and b) when $H$ equals the total mechanical energy. These important results are summarized below with a discussion of the assumptions made in deriving the Hamiltonian, as well as the implications.


a)  Conservation  of  generalized  energy 

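The generalized energy theorem (7.38) was given as

$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \frac{dh(\mathbf{q},\dot{\mathbf{q}},t)}{dt} = \sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] - \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{7.46}$$

Note that when $\sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] = 0$, then equation 7.46 reduces to

$$\frac{dH}{dt} = -\frac{\partial L}{\partial t} \tag{7.47}$$

Also, when $\sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] = 0$, and if the Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion. That is, $H$ is conserved if, and only if, the Lagrangian, and consequently the Hamiltonian, are not explicit functions of time, and if the external forces are zero.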


b)  The  generalized  energy  and  total  energy 

If  the  following  two  requirements  are  satisfied 

1) The kinetic energy has a homogeneous quadratic dependence on the generalized velocities, that is, the transformation to generalized coordinates is independent of time, $\partial x_{\alpha,i}/\partial t = 0$.

2) The potential energy is not velocity dependent, thus the terms $\partial U/\partial \dot{q}_i = 0$.

Then  equation  7.45  implies  that  the  Hamiltonian  equals  the  total  mechanical  energy,  that  is, 

H = T + U = E (7.48) 

Expressed  in  words,  the  generalized  energy  (Hamiltonian)  equals  the  total  energy  if  the  constraints  are 
time  independent  and  the  potential  energy  is  velocity  independent.  This  is  equivalent  to  stating  that,  if  the 
constraints,  or  generalized  coordinates,  for  the  system  are  time  independent,  then  H = E. 

The  four  combinations  of  the  above  two  independent  conditions,  assuming  that  the  external  forces  term 
in  equation  7.46  is  zero,  are  summarized  in  table  7.1. 


Table 7.1: Hamiltonian and total energy

Hamiltonian         Constraints and coordinate transformation
time behavior       Time independent             Time dependent

dH/dt = 0           H conserved, H = E           H conserved, H ≠ E

dH/dt ≠ 0           H not conserved, H = E       H not conserved, H ≠ E

Note  the  following  general  facts  regarding  the  Lagrangian  and  the  Hamiltonian. 

(1) The Lagrangian is indefinite with respect to addition of a constant to the scalar potential.

(2) The Lagrangian is indefinite with respect to addition of a constant velocity.

(3) There is no unique choice of generalized coordinates.

(4) The Hamiltonian is a scalar function that is derived from the Lagrangian scalar function.

(5) The generalized momentum is derived from the Lagrangian.

These facts, plus the ability to recognize the conditions under which $H$ is conserved, and when $H = E$, can greatly facilitate solving problems, as shown by the following examples.

7.5  Example:  Linear  harmonic  oscillator  on  a cart  moving  at  constant  velocity


Consider a linear harmonic oscillator on a cart moving with constant velocity $v_0$ in the $x$ direction, as shown in the adjacent figure. Let the laboratory frame be the unprimed frame and the cart frame be designated the primed frame. Assume that $x = x'$ at $t = 0$. Then

$$x' = x - v_0 t \qquad \dot{x}' = \dot{x} - v_0 \qquad \ddot{x}' = \ddot{x}$$

The harmonic oscillator will have a potential energy of

$$U = \frac{1}{2}kx'^2 = \frac{1}{2}k\,(x - v_0 t)^2$$

Figure: Harmonic oscillator on cart moving at uniform velocity $v_0$.

Laboratory frame: The Lagrangian is

$$L(x,\dot{x},t) = \frac{m\dot{x}^2}{2} - \frac{1}{2}k\,(x - v_0 t)^2$$

The Lagrange equation $\Lambda_x L = 0$ gives the equation of motion to be

$$m\ddot{x} = -k\,(x - v_0 t)$$

The definition of generalized momentum gives

$$p = \frac{\partial L}{\partial \dot{x}} = m\dot{x}$$

The Hamiltonian is

$$H(x,p,t) = \sum_i \dot{q}_i\frac{\partial L}{\partial \dot{q}_i} - L = \frac{p^2}{2m} + \frac{1}{2}k\,(x - v_0 t)^2$$

The Hamiltonian is the sum of the kinetic and potential energies and equals the total energy of the system, but it is not conserved since $L$ and $H$ are both explicit functions of time, that is, $\frac{dH}{dt} = \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \neq 0$.

Physically this is understood in that energy must flow into and out of the external constraint keeping the cart moving uniformly at a constant velocity $v_0$ against the reaction to the oscillating mass. That is, assuming a uniform velocity for the moving cart constitutes a time-dependent constraint on the mass, and the force of constraint does work in actual displacement of the complete system. If the constraint did not exist, then the cart momentum would oscillate such that the total momentum of the cart plus spring system is conserved.

Cart frame: Transform the Lagrangian to the primed coordinates in the moving frame of reference, which also is an inertial frame. Then the Lagrangian $L$, in terms of the moving cart frame coordinates, is

$$L(x',\dot{x}',t) = \frac{m}{2}\left(\dot{x}'^2 + 2\dot{x}'v_0 + v_0^2\right) - \frac{1}{2}kx'^2$$

The Lagrange equation $\Lambda_{x'} L = 0$ gives the equation of motion to be

$$m\ddot{x}' = -kx'$$

where $x'$ is the displacement of the mass with respect to the cart. This implies that an observer on the cart will observe simple harmonic motion, as is to be expected from Galilean invariance.

The definition of the generalized momentum gives the linear momentum in the primed frame coordinates to be

$$p' = \frac{\partial L}{\partial \dot{x}'} = m\dot{x}' + mv_0$$

The cart-frame Hamiltonian also can be expressed in terms of the coordinates in the moving frame to be

$$H(x',p',t) = \frac{(p' - mv_0)^2}{2m} + \frac{1}{2}kx'^2 - \frac{m}{2}v_0^2$$

Note that the Lagrangian and Hamiltonian expressed in terms of the coordinates in the cart frame of reference are not explicitly time dependent, therefore $H$ is conserved. However, the cart-frame Hamiltonian does not equal the total energy since the coordinate transformation is time dependent. Actually the first two terms in the above Hamiltonian are the energy of the harmonic oscillator in the cart frame. This example shows that the Hamiltonians differ when expressed in terms of the laboratory or the cart frame of reference.
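
The contrast between the two Hamiltonians in this example can be checked numerically. The following is a minimal sketch, assuming illustrative parameter values and a fourth-order Runge-Kutta integrator (neither is from the text). It integrates the laboratory-frame equation of motion $m\ddot{x} = -k(x - v_0t)$ and evaluates both Hamiltonians along the trajectory.

```python
def simulate_cart_oscillator(m=1.0, k=4.0, v0=0.5, x0=0.3, dt=1e-3, steps=5000):
    """Integrate m*x'' = -k*(x - v0*t) and return (H_lab, H_cart) sample lists."""

    def accel(x, t):
        return -k * (x - v0 * t) / m

    # Assumed initial condition: mass at rest relative to the cart, displaced by x0.
    x, v, t = x0, v0, 0.0
    h_lab, h_cart = [], []
    for _ in range(steps + 1):
        p = m * v                       # lab-frame momentum p = m*xdot
        x_c = x - v0 * t                # cart-frame coordinate x'
        p_c = m * (v - v0) + m * v0     # cart-frame momentum p' = m*x'dot + m*v0
        h_lab.append(p**2 / (2 * m) + 0.5 * k * (x - v0 * t)**2)
        h_cart.append((p_c - m * v0)**2 / (2 * m) + 0.5 * k * x_c**2
                      - 0.5 * m * v0**2)
        # Fourth-order Runge-Kutta step for dx/dt = v, dv/dt = accel(x, t).
        k1x, k1v = v, accel(x, t)
        k2x, k2v = v + 0.5 * dt * k1v, accel(x + 0.5 * dt * k1x, t + 0.5 * dt)
        k3x, k3v = v + 0.5 * dt * k2v, accel(x + 0.5 * dt * k2x, t + 0.5 * dt)
        k4x, k4v = v + dt * k3v, accel(x + dt * k3x, t + dt)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += dt
    return h_lab, h_cart


h_lab, h_cart = simulate_cart_oscillator()
print("lab-frame  H spread:", max(h_lab) - min(h_lab))
print("cart-frame H spread:", max(h_cart) - min(h_cart))
```

The cart-frame Hamiltonian should be constant to within integration error, while the laboratory-frame Hamiltonian oscillates. Algebraically the two differ by $pv_0$, the rate at which the constraint exchanges energy with the oscillator being proportional to $v_0$.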

7.6  Example:  Isotropic  central  force  in  a rotating  frame


Consider a mass subject to a central isotropic radial force with potential $U(r)$, as shown in the adjacent figure. Compare the Hamiltonian $H$ in the fixed frame of reference $S$ with the Hamiltonian $H'$ in a frame of reference $S'$ which is rotating about the center of the force with constant angular velocity $\omega$. Restrict this case to rotation about one axis so that only two polar coordinates $r$ and $\phi$ need to be considered. The transformations are

$$r' = r \qquad \phi' = \phi - \omega t$$

Also

$$U(r) = U(r')$$

Figure: Mass subject to radial force.

Fixed frame of reference $S$:

$$L = T - U = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\phi}^2\right) - U(r)$$

Since the Lagrangian is not explicitly time dependent, the Hamiltonian is conserved. For this fixed-frame Hamiltonian the generalized momenta are

$$p_r = \frac{\partial L}{\partial \dot{r}} = m\dot{r} \qquad p_\phi = \frac{\partial L}{\partial \dot{\phi}} = mr^2\dot{\phi}$$

The Hamiltonian equals

$$H(p_r,p_\phi,r,\phi) = \sum_i \dot{q}_i\frac{\partial L}{\partial \dot{q}_i} - L = \frac{1}{2m}\left(p_r^2 + \frac{p_\phi^2}{r^2}\right) + U(r) = E$$

The Hamiltonian in the fixed frame is conserved and equals the total energy, that is, $H = T + U$.

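Rotating frame of reference $S'$:

The above inertial fixed-frame Lagrangian can be written in terms of the primed (non-inertial rotating frame) coordinates as

$$L = T - U = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\phi}^2\right) - U(r) = \frac{m}{2}\left(\dot{r}'^2 + r'^2\left(\dot{\phi}' + \omega\right)^2\right) - U(r')$$

The generalized momenta derived from this Lagrangian are

$$p'_r = \frac{\partial L}{\partial \dot{r}'} = m\dot{r}' = p_r$$

$$p'_\phi = \frac{\partial L}{\partial \dot{\phi}'} = mr'^2\left(\dot{\phi}' + \omega\right) = mr'^2\dot{\phi}' + mr'^2\omega$$

The Hamiltonian expressed in terms of the non-inertial rotating frame coordinates is

$$H'(p'_r,p'_\phi,r',\phi') = \dot{r}'\frac{\partial L}{\partial \dot{r}'} + \dot{\phi}'\frac{\partial L}{\partial \dot{\phi}'} - L = \frac{1}{2m}\left({p'_r}^2 + \frac{{p'_\phi}^2}{r'^2}\right) - \omega p'_\phi + U(r')$$

Note that $H'(p'_r,p'_\phi,r',\phi')$ is time independent and therefore is conserved, but $H'(p'_r,p'_\phi,r',\phi') \neq E$ because the generalized coordinates are time dependent. In addition, $p'_\phi$ is conserved since

$$\dot{p}'_\phi = \frac{\partial L}{\partial \phi'} = -\frac{\partial H'}{\partial \phi'} = 0$$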
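
The two Hamiltonians of this example can be related directly. The short derivation below is a sketch that uses only the transformation $r' = r$, $\phi' = \phi - \omega t$ and the momenta defined above, for which $p'_r = p_r$ and $p'_\phi = mr^2\dot{\phi} = p_\phi$.

```latex
\begin{align*}
H' &= \frac{1}{2m}\left({p'_r}^{2} + \frac{{p'_\phi}^{2}}{r'^{2}}\right) - \omega p'_\phi + U(r') \\
   &= \frac{1}{2m}\left(p_r^{2} + \frac{p_\phi^{2}}{r^{2}}\right) + U(r) - \omega p_\phi \\
   &= H - \omega p_\phi = E - \omega p_\phi
\end{align*}
```

Since both $E$ and $p_\phi$ are constants of motion, $H'$ is conserved, but it differs from the total energy by the constant $\omega p_\phi$.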

7.7  Example:  The  plane  pendulum

The simple plane pendulum in a uniform gravitational field $g$ is an example that illustrates Hamiltonian invariance. There is only one generalized coordinate, $\theta$, and the Lagrangian for this system is

$$L = \frac{1}{2}ml^2\dot{\theta}^2 + mgl\cos\theta$$

The momentum conjugate to $\theta$ is

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = ml^2\dot{\theta}$$

which is the angular momentum about the pivot point. Using the Lagrange-Euler equation this gives that

$$\dot{p}_\theta = \frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} = \frac{\partial L}{\partial \theta} = -mgl\sin\theta$$

Figure: The plane pendulum constrained to oscillate in a vertical plane in a uniform gravitational field.

Note that the angular momentum $p_\theta$ is not a constant of motion since the Lagrangian depends explicitly on $\theta$.

The Hamiltonian is

$$H = \sum_i p_i\dot{q}_i - L = \frac{1}{2}ml^2\dot{\theta}^2 - mgl\cos\theta = \frac{p_\theta^2}{2ml^2} - mgl\cos\theta$$

Note that the Lagrangian and Hamiltonian are not explicit functions of time, therefore the Hamiltonian is conserved. Also the potential is velocity independent and there is no coordinate transformation, thus the Hamiltonian equals the total energy $E$, which is a constant of motion.

$$\frac{p_\theta^2}{2ml^2} - mgl\cos\theta = E$$
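
Because $H = E$ is a constant of motion, the angular speed of the pendulum at any angle follows from the release angle without solving the equation of motion. The sketch below assumes the pendulum is released from rest at a hypothetical angle `theta0`; the function names and parameter values are illustrative.

```python
import math


def hamiltonian(theta, p_theta, m=1.0, l=1.0, g=9.81):
    """Pendulum Hamiltonian H = p_theta^2/(2 m l^2) - m g l cos(theta)."""
    return p_theta**2 / (2 * m * l**2) - m * g * l * math.cos(theta)


def angular_speed(theta, theta0, l=1.0, g=9.81):
    """Angular speed at angle theta after release from rest at theta0.

    Follows from H(theta, p_theta) = H(theta0, 0); valid for |theta| <= |theta0|.
    """
    return math.sqrt(2 * g * (math.cos(theta) - math.cos(theta0)) / l)


theta0 = 1.0                       # release angle in radians (assumed)
energy = hamiltonian(theta0, 0.0)  # conserved value of H
for theta in (0.9, 0.5, 0.0):
    p_theta = 1.0 * 1.0**2 * angular_speed(theta, theta0)   # p = m l^2 thetadot
    print(theta, p_theta, hamiltonian(theta, p_theta) - energy)
```

The last printed column should be zero to within round-off, while $p_\theta$ itself changes with $\theta$, consistent with the statement that $H$ is conserved but $p_\theta$ is not.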


7.8  Example:  Oscillating  cylinder  in  a cylindrical  bowl

It is important to correctly account for constraint forces when using Noether’s theorem for constrained systems. Noether’s theorem assumes the variables are independent. This is illustrated by considering the example of a solid cylinder rolling in a fixed cylindrical bowl. Assume that a uniform cylinder of radius $\rho$ and mass $m$ is constrained to roll without slipping on the inner surface of the lower half of a hollow cylinder of radius $R$. The motion is constrained to ensure that the axes of both cylinders remain parallel and $\rho < R$.

The generalized coordinates are taken to be the angles $\theta$ and $\phi$, which are measured with respect to a fixed vertical axis. Then the kinetic energy and potential energy are

$$T = \frac{1}{2}m\left[(R - \rho)\dot{\theta}\right]^2 + \frac{1}{2}I\dot{\phi}^2 \qquad U = \left[R - (R - \rho)\cos\theta\right]mg$$

where $m$ is the mass of the small cylinder and where $U$ is measured from the lowest point of the bowl. The moment of inertia of a uniform cylinder is $I = \frac{1}{2}m\rho^2$.

The Lagrangian is

$$L = T - U = \frac{1}{2}m\left[(R - \rho)\dot{\theta}\right]^2 + \frac{1}{4}m\rho^2\dot{\phi}^2 - \left[R - (R - \rho)\cos\theta\right]mg$$

Since the solid cylinder rotates without slipping inside the cylindrical shell, the equation of constraint is

$$g(\theta,\phi) = R\theta - \rho\,(\phi + \theta) = 0$$
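
Using the Lagrangian, plus the one equation of constraint, requires one Lagrange multiplier. Then the Lagrange equations of motion for $\theta$ and $\phi$ are

$$\frac{\partial L}{\partial \theta} - \frac{d}{dt}\left[\frac{\partial L}{\partial \dot{\theta}}\right] + \lambda\frac{\partial g}{\partial \theta} = 0$$

$$\frac{\partial L}{\partial \phi} - \frac{d}{dt}\left[\frac{\partial L}{\partial \dot{\phi}}\right] + \lambda\frac{\partial g}{\partial \phi} = 0$$

Substituting the Lagrangian and the equation of constraint gives two equations of motion

$$-(R - \rho)\,mg\sin\theta - m\,(R - \rho)^2\ddot{\theta} + \lambda\,(R - \rho) = 0$$

$$-\frac{1}{2}m\rho^2\ddot{\phi} - \lambda\rho = 0$$

The lower equation of motion gives that

$$\lambda = -\frac{1}{2}m\rho\ddot{\phi}$$

Substituting this into the equation of constraint gives

$$\lambda = -\frac{1}{2}m\,(R - \rho)\,\ddot{\theta}$$

Substituting this into the first equation of motion gives the equation of motion for $\theta$ to be

$$\ddot{\theta} = -\frac{2g}{3\,(R - \rho)}\sin\theta$$

that is,

$$\lambda = \frac{mg}{3}\sin\theta$$

The torque acting on the small cylinder due to the frictional force is

$$F\rho = I\ddot{\phi} = \frac{1}{2}m\rho^2\ddot{\phi} = -\lambda\rho$$

Thus the frictional force is

$$F = -\lambda = -\frac{mg}{3}\sin\theta$$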

O 


Noether’s theorem can be used to ascertain if the angular momentum $p_\theta$ is a constant of motion. The derivative of the Lagrangian is

$$\frac{\partial L}{\partial \theta} = -(R - \rho)\,mg\sin\theta$$

and thus the Lagrange equations tell us that $\dot{p}_\theta = -(R - \rho)\,mg\sin\theta$. Therefore $p_\theta$ is not a constant of motion. The Lagrangian is not an explicit function of $\phi$, which would suggest that $p_\phi$ is a constant of motion. But this is incorrect because the constraint equation $\phi = \frac{(R - \rho)}{\rho}\theta$ couples $\theta$ and $\phi$, that is, they are not independent variables, and thus $p_\theta$ and $p_\phi$ are coupled by the constraint equation. As a result $p_\phi$ is not a constant of motion because it is directly coupled to $p_\theta$, whose time derivative $\dot{p}_\theta = -(R - \rho)\,mg\sin\theta$ is nonzero. Thus neither $p_\theta$ nor $p_\phi$ is a constant of motion. This illustrates that one must account carefully for equations of constraint, and the concomitant constraint forces, when applying Noether’s theorem, which tacitly assumes independent variables.

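The Hamiltonian can be derived using the generalized momenta

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m\,(R - \rho)^2\dot{\theta}$$

$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = \frac{1}{2}m\rho^2\dot{\phi}$$

Then the Hamiltonian is given by

$$H = p_\theta\dot{\theta} + p_\phi\dot{\phi} - L = \frac{p_\theta^2}{2m\,(R - \rho)^2} + \frac{p_\phi^2}{m\rho^2} + \left[R - (R - \rho)\cos\theta\right]mg$$

Note that the transformation to generalized coordinates is time independent and the potential is not velocity dependent, thus the Hamiltonian also equals the total energy. Also the Hamiltonian is conserved since

$$\frac{dH}{dt} = 0$$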
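
The results of this example can be cross-checked numerically. The following is a minimal sketch, assuming illustrative values for $R$, $\rho$, $m$ and the release angle, and using a fourth-order Runge-Kutta integrator (none of these choices come from the text). It integrates $\ddot{\theta} = -\frac{2g}{3(R-\rho)}\sin\theta$, obtains $\dot{\phi}$ from the rolling constraint, and evaluates the Hamiltonian along the motion.

```python
import math


def simulate_cylinder(theta0=0.4, R=1.0, rho=0.2, m=1.0, g=9.81,
                      dt=1e-3, steps=3000):
    """Return the Hamiltonian sampled along the rolling motion of the cylinder."""
    a = R - rho

    def acc(theta):
        # Equation of motion for theta derived in the example.
        return -2.0 * g * math.sin(theta) / (3.0 * a)

    def hamiltonian(theta, thetadot):
        phidot = a * thetadot / rho           # from R*theta - rho*(phi + theta) = 0
        p_theta = m * a**2 * thetadot
        p_phi = 0.5 * m * rho**2 * phidot
        return (p_theta**2 / (2 * m * a**2) + p_phi**2 / (m * rho**2)
                + (R - a * math.cos(theta)) * m * g)

    theta, w = theta0, 0.0                    # released from rest (assumed)
    values = [hamiltonian(theta, w)]
    for _ in range(steps):
        # Fourth-order Runge-Kutta step for dtheta/dt = w, dw/dt = acc(theta).
        k1t, k1w = w, acc(theta)
        k2t, k2w = w + 0.5 * dt * k1w, acc(theta + 0.5 * dt * k1t)
        k3t, k3w = w + 0.5 * dt * k2w, acc(theta + 0.5 * dt * k2t)
        k4t, k4w = w + dt * k3w, acc(theta + dt * k3t)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        w += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
        values.append(hamiltonian(theta, w))
    return values


h_values = simulate_cylinder()
print("H spread along the motion:", max(h_values) - min(h_values))
```

A spread close to zero indicates that the equation of motion obtained with the Lagrange multiplier is consistent with $dH/dt = 0$ once the constraint is imposed.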

7.11  Hamiltonian  for  cyclic  coordinates


It is interesting to discuss the properties of the Hamiltonian for cyclic coordinates $q_k$ for which $\partial L/\partial q_k = 0$. Ignoring the external and Lagrange multiplier terms,

$$\dot{p}_k = \frac{\partial L}{\partial q_k} = -\frac{\partial H}{\partial q_k} = 0 \tag{7.49}$$


That is, a cyclic coordinate has a constant corresponding momentum $p_k$ for the Hamiltonian as well as for the Lagrangian. Conversely, if a generalized coordinate does not occur in the Hamiltonian, then the corresponding generalized momentum is conserved. Cyclic coordinates were discussed earlier when discussing symmetries and conservation-law aspects of the Lagrangian. For example, if the Lagrangian, or Hamiltonian, does not depend on a linear coordinate $x$, then $p_x$ is conserved. Similarly for $\theta$ and $p_\theta$. An extension of this principle has been derived for the relationship between time independence and total energy of a system, that is, the Hamiltonian equals the total energy if the transformation to generalized coordinates is time independent and the potential is velocity independent.

A valuable feature of the Hamiltonian formulation is that it allows elimination of cyclic variables, which reduces the number of degrees of freedom to be handled. As a consequence, cyclic variables are called ignorable variables in Hamiltonian mechanics. For example, consider that the Lagrangian has one cyclic variable $q_n$. As a consequence, the Lagrangian does not depend on $q_n$, and thus it can be written as $L = L(q_1, \ldots, q_{n-1}; \dot{q}_1, \ldots, \dot{q}_n; t)$. The Lagrangian still contains $n$ generalized velocities, thus one still has to treat $n$ degrees of freedom even though one degree of freedom $q_n$ is cyclic. However, in the Hamiltonian formulation, only $n - 1$ degrees of freedom are required since the momentum for the cyclic degree of freedom is a constant $p_n = \alpha$. Thus the Hamiltonian can be written as $H = H(q_1, \ldots, q_{n-1}; p_1, \ldots, p_{n-1}; \alpha; t)$, that is, the Hamiltonian includes only $n - 1$ degrees of freedom. Thus the dimension of the problem has been reduced by one since the conjugate cyclic (ignorable) variables $(q_n, p_n)$ are eliminated. Hamiltonian mechanics can significantly reduce the dimension of the problem when the system involves several cyclic variables. This is in contrast to the situation for the Lagrangian approach as discussed in chapters 8 and 14.
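
The fixed-frame Hamiltonian of Example 7.6 provides a simple illustration of this reduction. The sketch below writes the constant momentum of the cyclic coordinate $\phi$ as $\alpha$, following the notation $p_n = \alpha$ used above.

```latex
\begin{align*}
H(r,\phi,p_r,p_\phi) &= \frac{1}{2m}\left(p_r^{2} + \frac{p_\phi^{2}}{r^{2}}\right) + U(r),
\qquad \dot{p}_\phi = -\frac{\partial H}{\partial \phi} = 0
\;\Rightarrow\; p_\phi = \alpha \\
H(r,p_r;\alpha) &= \frac{p_r^{2}}{2m} + \frac{\alpha^{2}}{2mr^{2}} + U(r)
\end{align*}
```

The problem with two degrees of freedom is thereby reduced to one-dimensional radial motion in which $\alpha$ enters only as a parameter.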


7.12  Symmetries  and  invariance 

This chapter has shown that the symmetries of a system lead to invariance of physical quantities, as was proposed by Noether. The symmetry properties of the Lagrangian can lead to the conservation laws summarized in table 7.2.


Table 7.2: Symmetries and conservation laws in classical mechanics

Symmetry               Lagrange property           Conserved quantity

Spatial homogeneity    Translational invariance    Linear momentum

Spatial isotropy       Rotational invariance       Angular momentum

Time invariance        Time independence           Total energy

The  importance  of  the  relations  between  invariance  and  symmetry  cannot  be  overemphasized.  It  extends 
beyond  classical  mechanics  to  quantum  physics  and  field  theory.  For  a three-dimensional  closed  system, 
there  are  three  possible  constants  for  linear  momentum,  three  for  angular  momentum,  and  one  for  energy.  It 
is  especially  interesting  in  that  these,  and  only  these,  seven  integrals  have  the  property  that  they  are  additive 
for  the  particles  comprising  a system,  and  this  occurs  independent  of  whether  there  is  an  interaction  among 
the particles. That is, this behavior is obeyed by the whole assembly of particles for finite systems. Because of their profound importance to physics, these relations between symmetry and invariance are used extensively.

7.13  Hamiltonian  in  classical  mechanics 

The Hamiltonian was defined by equation 7.37 during the discussion of time invariance and energy conservation. The Hamiltonian is of much more profound importance to physics than implied by the ad hoc definition given by equation 7.37. This relates to the fact that the Hamiltonian is written in terms of the fundamental coordinate $q_i$ and its generalized momentum $p_i$ defined by equation 7.3.

It is more convenient to write the $n$ generalized coordinates $q_i$, plus their generalized momenta $p_i$, as vectors, e.g. $\mathbf{q} \equiv (q_1, q_2, \ldots, q_n)$, $\mathbf{p} \equiv (p_1, p_2, \ldots, p_n)$. The generalized momenta conjugate to the coordinates $q_i$, defined by 7.3, then can be written in the form

$$p_i \equiv \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial \dot{q}_i} \tag{7.50}$$

Substituting this definition of the generalized momentum into the Hamiltonian defined in (7.37), and expressing it in terms of the coordinate $\mathbf{q}$ and its conjugate generalized momenta $\mathbf{p}$, leads to

$$H(\mathbf{q},\mathbf{p},t) = \sum_i p_i\dot{q}_i - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.51}$$

$$= \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.52}$$

Note that the scalar product $\mathbf{p}\cdot\dot{\mathbf{q}} = \sum_i p_i\dot{q}_i$ equals $2T$ for systems that are scleronomic and when the potential is velocity independent.

The crucial feature of the Hamiltonian is that it is expressed as $H(\mathbf{q},\mathbf{p},t)$, that is, it is a function of the $n$ generalized coordinates $\mathbf{q}$ and their conjugate momenta $\mathbf{p}$, which are taken to be independent, in addition to the independent variable, $t$. This is in contrast to the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$, which is a function of the $n$ generalized coordinates $q_j$, the corresponding velocities $\dot{q}_j$, and time $t$. The velocities $\dot{\mathbf{q}}$ are the time derivatives of the coordinates $\mathbf{q}$ and thus these are related. In physics, the fundamental conjugate coordinates are $(\mathbf{q},\mathbf{p})$, which are the coordinates underlying the Hamiltonian. This is in contrast to $(\mathbf{q},\dot{\mathbf{q}})$, which are the coordinates that underlie the Lagrangian. Thus the Hamiltonian is more fundamental than the Lagrangian, which is one reason why Hamiltonian mechanics, rather than Lagrangian mechanics, was used as the foundation for development of quantum and statistical mechanics.

Hamiltonian mechanics will be derived in two other ways. Chapter 8 uses the Legendre transformation between the conjugate variables $(\mathbf{q},\dot{\mathbf{q}},t)$ and $(\mathbf{q},\mathbf{p},t)$, where the generalized coordinate $\mathbf{q}$ and its conjugate generalized momentum $\mathbf{p}$ are independent. This shows that Hamiltonian mechanics is based on the same variational principles as those used to derive Lagrangian mechanics. Chapter 13 derives Hamiltonian mechanics directly from Hamilton’s Principle of Least Action. Chapter 8 will introduce the algebraic Hamiltonian mechanics that is based on the Hamiltonian. The powerful capabilities provided by Hamiltonian mechanics will be described in chapter 14.


7.14  Summary 

This  chapter  has  explored  the  importance  of  symmetries  and  invariance  in  Lagrangian  mechanics  and  has 
introduced  the  Hamiltonian.  The  following  are  the  main  points  introduced  in  this  chapter. 

Noether’s  theorem: 

Noether’s theorem explores the remarkable connection between symmetry, the invariance of a system under transformation, and the related conservation laws, which imply the existence of important physical principles and constants of motion. Transformations where the equations of motion are invariant are called invariant transformations. Variables that are invariant to a transformation are called cyclic variables. It was shown that if the Lagrangian does not explicitly contain a particular coordinate of displacement, $q_i$, then the corresponding conjugate momentum, $p_i$, is conserved. This is Noether’s theorem, which states “For each symmetry of the Lagrangian, there is a conserved quantity”. In particular it was shown that translational invariance in a given direction leads to the conservation of linear momentum in that direction, and rotational invariance about an axis leads to conservation of angular momentum about that axis. These are the first-order spatial and angular integrals of the equations of motion. Noether’s theorem also relates the properties of the Hamiltonian to time invariance of the Lagrangian, namely:

(1)  H is  conserved  if,  and  only  if,  the  Lagrangian,  and  consequently  the  Hamiltonian,  are  not  explicit 
functions  of  time. 

(2) The Hamiltonian gives the total energy if the constraints and coordinate transformations are time independent and the potential energy is velocity independent. This is equivalent to stating that if the constraints, or generalized coordinates, for the system are time independent then $H = E$.

Noether’s  theorem  is  of  importance  since  it  underlies  the  relation  between  symmetries,  and  invariance  in 
all  of  physics;  that  is,  its  applicability  extends  beyond  classical  mechanics. 

Generalized momentum:

The generalized momentum associated with the coordinate $q_j$ is defined to be

$$p_j \equiv \frac{\partial L}{\partial \dot{q}_j} \tag{7.3}$$

where $p_j$ is also called the conjugate momentum (or canonical momentum) to $q_j$, and where $q_j$, $p_j$ are conjugate, or canonical, variables. Remember that the linear momentum $p_j$ is the first-order time integral given by equation 2.10. Note that if $q_j$ is not a spatial coordinate, then $p_j$ is not linear momentum, but is the conjugate momentum. For example, if $q_j$ is an angle, then $p_j$ will be angular momentum.

Kinetic  energy  in  generalized  coordinates: 

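It was shown that the kinetic energy can be expressed in terms of generalized coordinates by

$$T(\mathbf{q},\dot{\mathbf{q}},t) = \sum_\alpha\sum_{i,j,k}\frac{1}{2}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k}\dot{q}_j\dot{q}_k + \sum_\alpha\sum_{i,j}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial t}\dot{q}_j + \sum_\alpha\sum_i\frac{1}{2}m_\alpha\left(\frac{\partial x_{\alpha,i}}{\partial t}\right)^2 \tag{7.19}$$

$$= T_2(\mathbf{q},\dot{\mathbf{q}},t) + T_1(\mathbf{q},\dot{\mathbf{q}},t) + T_0(\mathbf{q},t) \tag{7.53}$$

For scleronomic systems with a potential that is velocity independent, the kinetic energy can be expressed as

$$T = T_2 = \frac{1}{2}\sum_i \dot{q}_i p_i = \frac{1}{2}\,\dot{\mathbf{q}}\cdot\mathbf{p} \tag{7.31}$$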

Generalized  energy 

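Jacobi’s Generalized Energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ was defined as

$$h(\mathbf{q},\dot{\mathbf{q}},t) \equiv \sum_j \dot{q}_j\frac{\partial L}{\partial \dot{q}_j} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.36}$$

Hamiltonian function

The Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ was defined in terms of the generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ and by introducing the generalized momentum. That is

$$H(\mathbf{q},\mathbf{p},t) \equiv h(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j p_j\dot{q}_j - L(\mathbf{q},\dot{\mathbf{q}},t) = \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.37}$$
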

Generalized  energy  theorem 

The  equations  of  motion  lead  to  the  generalized  energy  theorem  which  states  that  the  time  dependence 
of  the  Hamiltonian  is  related  to  the  time  dependence  of  the  Lagrangian. 


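$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] - \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{7.38}$$

Note that if all the generalized non-potential forces are zero, then the bracket in equation 7.38 is zero, and if in addition the Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion.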

Generalized  energy  and  total  energy: 

The  generalized  energy,  and  corresponding  Hamiltonian,  equal  the  total  energy  if: 

1) The kinetic energy has a homogeneous quadratic dependence on the generalized velocities and the transformation to generalized coordinates is independent of time, $\partial x_{\alpha,i}/\partial t = 0$.

2) The potential energy is not velocity dependent, thus the terms $\partial U/\partial \dot{q}_i = 0$.

Chapter  8 will  introduce  Hamiltonian  mechanics  that  is  built  on  the  Hamiltonian,  and  chapter  14  will 
explore  applications  of  Hamiltonian  mechanics. 

Workshop  exercises


1.  Consider  a particle  of  mass  m moving  in  a plane  and  subject  to  an  inverse  square  attractive  force. 

(a)  Obtain  the  equations  of  motion. 

(b)  Is  the  angular  momentum  about  the  origin  conserved? 

(c)  Obtain  expressions  for  the  generalized  forces. 

2. Consider a Lagrangian function of the form $L(q_i, \dot{q}_i, \ddot{q}_i, t)$. Here the Lagrangian contains a time derivative of the generalized coordinates that is higher than the first. When working with such Lagrangians, the term “generalized mechanics” is used.


(a) Consider a system with one degree of freedom. By applying the methods of the calculus of variations, and assuming that Hamilton’s principle holds with respect to variations which keep both $q$ and $\dot{q}$ fixed at the end points, show that the corresponding Lagrange equation is

$$\frac{d^2}{dt^2}\left(\frac{\partial L}{\partial \ddot{q}}\right) - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right) + \frac{\partial L}{\partial q} = 0$$

Such equations of motion have interesting applications in chaos theory.

(b)  Apply  this  result  to  the  Lagrangian 


Do  you  recognize  the  equations  of  motion? 


3.  A uniform  solid  cylinder  of  radius  R and  mass  M rests  on  a horizontal  plane  and  an  identical  cylinder  rests 
on  it  touching  along  the  top  of  the  first  cylinder  with  the  axes  of  both  cylinders  parallel.  The  upper  cylinder 
is given an infinitesimal displacement so that both cylinders roll without slipping in the directions shown by
the  arrows. 


(a) Find the Lagrangian for this system.

(b)  What  are  the  constants  of  motion? 

(c)  Show  that  as  long  as  the  cylinders  remain  in  contact  then 

$$\dot{\theta}^2 = \frac{12g\,(1 - \cos\theta)}{R\,(17 + 4\cos\theta - 4\cos^2\theta)}$$


4.  Consider  a diatomic  molecule  which  has  a symmetry  axis  along  the  line  through  the  center  of  the  two  atoms 
comprising  the  molecule.  Consider  that  this  molecule  is  rotating  about  an  axis  perpendicular  to  the  symmetry 
axis  and  that  there  are  no  external  forces  acting  on  the  molecule.  Use  Noether’s  Theorem  to  answer  the 
following  questions: 

a)  Is  the  total  angular  momentum  conserved? 

b)  Is  the  projection  of  the  total  angular  momentum  along  a space-fixed  z axis  conserved? 

c)  Is  the  projection  of  the  angular  momentum  along  the  symmetry  axis  of  the  rotating  molecule  conserved? 

d)  Is  the  projection  of  the  angular  momentum  perpendicular  to  the  rotating  symmetry  axis  conserved? 




5. A bead of mass m slides under gravity along a smooth wire bent in the shape of a parabola $x^2 = az$ in the
vertical (x, z) plane.

(a) What kind (holonomic, nonholonomic, scleronomic, rheonomic) of constraint acts on m?

(b) Set up Lagrange's equation of motion for x with the constraint embedded.

(c) Set up Lagrange's equations of motion for both x and z with the constraint adjoined and a Lagrangian
multiplier $\lambda$ introduced.

(d) Show that the same equation of motion for x results from either of the methods used in part (b) or part
(c).

(e) Express $\lambda$ in terms of $x$ and $\dot{x}$.

(f) What are the x and z components of the force of constraint in terms of $x$ and $\dot{x}$?

Problems 

1. Let the horizontal plane be the x-y plane. A bead of mass m is constrained to slide with speed v along a
curve described by the function y = f(x). What force does the curve apply to the bead? (Ignore gravity.)

2. Consider the Atwood's machine shown. The masses are 4m, 5m, and 3m. Let x and y be the heights of the
right two masses relative to their initial positions.

a)  Solve  this  problem  using  the  Euler-Lagrange  equations 

b)  Use  Noether’s  theorem  to  find  the  conserved  momentum. 


3. A cube of side 2b and center of mass C is placed on a fixed horizontal cylinder of radius r and center O as
shown in the figure. Originally the cube is placed such that C is centered above O but it can roll from side to
side without slipping. (a) Assuming that b < r use the Lagrangian approach to find the frequency for small
oscillations about the top of the cylinder. For simplicity make the small angle approximation for L before using
the Lagrange-Euler equations. (b) What will be the motion if b > r? Note that the moment of inertia of the
cube about the center of mass is $\frac{2}{3} m b^2$.

4. Two equal masses of mass m are glued to a massless hoop of radius R that is free to rotate about its center in a
vertical plane. The angle between the masses is $2\theta$, as shown. Find the frequency of oscillations.


5. Three massless sticks, each of length 2r and each carrying a mass m at its center, are
hinged at their ends as shown. The bottom end of the lower stick is hinged at the ground. They are held so
that the lower two sticks are vertical, and the upper one is tilted at a small angle $\epsilon$ with respect to the vertical.
They are then released. At the instant of release what are the three equations of motion derived from the
Lagrangian, assuming that $\epsilon$ is small? Use these to determine the initial angular accelerations of the
three sticks.


Chapter  8 


Hamiltonian  mechanics 


8.1  Introduction 


The  three  major  formulations  of  classical  mechanics  are 

1. Newtonian mechanics, which is the most intuitive vector formulation used in classical mechanics.

2. Lagrangian mechanics, which is a powerful algebraic formulation of classical mechanics derived using either
d'Alembert's Principle or Hamilton's Principle. The latter states "A dynamical system follows a path
that minimizes the time integral of the difference between the kinetic and potential energies".

3. Hamiltonian mechanics, which has a beautiful superstructure that, like Lagrangian mechanics, is built
upon variational calculus, Hamilton's principle, and Lagrangian mechanics.


Hamiltonian mechanics is introduced at this juncture since it is closely interwoven with Lagrangian
mechanics. Hamiltonian mechanics plays a fundamental role in modern physics, but the discussion of that
role is deferred until chapters 14 and 17 where applications to modern physics
are addressed.

The  following  important  concepts  were  introduced  in  chapter  7: 

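The generalized momentum was defined to be given by

$$ p_j \equiv \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \dot{q}_j} \tag{8.1} $$

Note that, as discussed in chapter 7.2, if the potential is velocity dependent, such as for the Lorentz force, then
the generalized momentum includes terms in addition to the usual mechanical momentum.

Jacobi's generalized energy function $h(\mathbf{q}, \dot{\mathbf{q}}, t)$ was introduced where

$$ h(\mathbf{q}, \dot{\mathbf{q}}, t) = \sum_j \dot{q}_j \frac{\partial L}{\partial \dot{q}_j} - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.2} $$

The Hamiltonian function was defined to be given by expressing the generalized energy function,
equation 8.2, in terms of the generalized momentum. That is, the Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$ is expressed as

$$ H(\mathbf{q}, \mathbf{p}, t) = \sum_i^n p_i \dot{q}_i - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.3} $$

The symbols $\mathbf{q}$, $\mathbf{p}$ designate vectors of n generalized coordinates, $\mathbf{q} = (q_1, q_2, \ldots, q_n)$, $\mathbf{p} = (p_1, p_2, \ldots, p_n)$.
Equation 8.3 can be written compactly in a symmetric form using the scalar product $\mathbf{p} \cdot \dot{\mathbf{q}} = \sum_i p_i \dot{q}_i$.

$$ H(\mathbf{q}, \mathbf{p}, t) + L(\mathbf{q}, \dot{\mathbf{q}}, t) = \mathbf{p} \cdot \dot{\mathbf{q}} \tag{8.4} $$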

A crucial feature of Hamiltonian mechanics is that the Hamiltonian is expressed as $H(\mathbf{q}, \mathbf{p}, t)$, that
is, it is a function of the n generalized coordinates and their conjugate momenta, which are taken to be
independent, plus the independent variable, time. This contrasts with the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ which is a
function of the n generalized coordinates $q_j$, and the corresponding velocities $\dot{q}_j$, that is the time derivatives
of the coordinates $q_j$, plus the independent variable, time.
8.2 Legendre Transformation between Lagrangian and Hamiltonian mechanics

Hamiltonian mechanics can be derived directly from Lagrangian mechanics by considering the Legendre
transformation between the conjugate variables $(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $(\mathbf{q}, \mathbf{p}, t)$. Such a derivation is of considerable
importance in that it shows that Hamiltonian mechanics is based on the same variational principles as those
used to derive Lagrangian mechanics; that is d'Alembert's Principle and Hamilton's Principle. The general
problem of converting Lagrange's equations into the Hamiltonian form hinges on the inversion of equation
(8.1) that defines the generalized momentum $\mathbf{p}$. This inversion is simplified by the fact that (8.1) is the first
partial derivative of the Lagrangian scalar function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$.

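As described in appendix F.4, consider transformations between two functions $F(\mathbf{u}, \mathbf{w})$ and $G(\mathbf{v}, \mathbf{w})$,
where $\mathbf{u}$ and $\mathbf{v}$ are the active variables related by the functional form

$$ \mathbf{v} = \nabla_{\mathbf{u}} F(\mathbf{u}, \mathbf{w}) \tag{8.5} $$

and where $\mathbf{w}$ designates passive variables. The function $\nabla_{\mathbf{u}} F(\mathbf{u}, \mathbf{w})$ is the first-order derivative (gradient)
of $F(\mathbf{u}, \mathbf{w})$ with respect to the components of the vector $\mathbf{u}$. The Legendre transform states that the inverse
formula can always be written as a first-order derivative

$$ \mathbf{u} = \nabla_{\mathbf{v}} G(\mathbf{v}, \mathbf{w}) \tag{8.6} $$

The function $G(\mathbf{v}, \mathbf{w})$ is related to $F(\mathbf{u}, \mathbf{w})$ by the symmetric relation

$$ G(\mathbf{v}, \mathbf{w}) + F(\mathbf{u}, \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} \tag{8.7} $$

where the scalar product $\mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^{N} u_i v_i$.

Furthermore the first-order derivatives with respect to all the passive variables $w_i$ are related by

$$ \nabla_{\mathbf{w}} F(\mathbf{u}, \mathbf{w}) = -\nabla_{\mathbf{w}} G(\mathbf{v}, \mathbf{w}) \tag{8.8} $$

The relationship between the functions $F(\mathbf{u}, \mathbf{w})$ and $G(\mathbf{v}, \mathbf{w})$ is symmetrical and each is said to be the
Legendre transform of the other.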

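The general Legendre transform can be used to relate the Lagrangian and Hamiltonian by identifying the
active variables $\mathbf{v}$ with $\mathbf{p}$, and $\mathbf{u}$ with $\dot{\mathbf{q}}$, the passive variable $\mathbf{w}$ with $\mathbf{q}, t$, and the corresponding functions
$F(\mathbf{u}, \mathbf{w}) = L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $G(\mathbf{v}, \mathbf{w}) = H(\mathbf{q}, \mathbf{p}, t)$. Thus the generalized momentum (8.1) corresponds to

$$ \mathbf{p} = \nabla_{\dot{\mathbf{q}}} L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.9} $$

where $(\mathbf{q}, t)$ are the passive variables. Then the Legendre transform states that the transformed variable $\dot{\mathbf{q}}$
is given by the relation

$$ \dot{\mathbf{q}} = \nabla_{\mathbf{p}} H(\mathbf{q}, \mathbf{p}, t) \tag{8.10} $$

Since the functions $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $H(\mathbf{q}, \mathbf{p}, t)$ are the Legendre transforms of each other, they satisfy the
relation

$$ H(\mathbf{q}, \mathbf{p}, t) + L(\mathbf{q}, \dot{\mathbf{q}}, t) = \mathbf{p} \cdot \dot{\mathbf{q}} \tag{8.11} $$

The function $H(\mathbf{q}, \mathbf{p}, t)$, which is the Legendre transform of the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$, is called the
Hamiltonian function and equation (8.11) is identical to our original definition of the Hamiltonian given by
equation (8.3). The variables $\mathbf{q}$ and $t$ are passive variables, thus equation (8.8) gives that

$$ \nabla_{\mathbf{q}} L(\dot{\mathbf{q}}, \mathbf{q}, t) = -\nabla_{\mathbf{q}} H(\mathbf{p}, \mathbf{q}, t) \tag{8.12} $$

Written in component form, equation 8.12 gives the partial derivative relations

$$ \frac{\partial L(\dot{\mathbf{q}}, \mathbf{q}, t)}{\partial q_i} = -\frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial q_i} \tag{8.13} $$

$$ \frac{\partial L(\dot{\mathbf{q}}, \mathbf{q}, t)}{\partial t} = -\frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial t} \tag{8.14} $$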
Note that equations 8.13 and 8.14 are strictly a result of the Legendre transformation. To complete the
transformation from Lagrangian to Hamiltonian mechanics it is necessary to invoke the calculus of variations
via the Lagrange-Euler equations. The symmetry of the Legendre transform is illustrated by equation 8.11.

Equation 7.31 gives that the scalar product $\mathbf{p} \cdot \dot{\mathbf{q}} = 2T$. For scleronomic systems, with velocity
independent potentials U, the standard Lagrangian is $L = T - U$ and $H = 2T - T + U = T + U$. Thus, for this simple
case, equation 8.11 reduces to the identity $H + L = 2T$.
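The Legendre transform described above can be checked numerically. The following is a minimal sketch, not the book's method: it assumes a one-dimensional Lagrangian $L = \frac{1}{2} m \dot{q}^2 - \frac{1}{2} k q^2$ with illustrative values of m and k, estimates $p = \partial L / \partial \dot{q}$ by a finite difference, inverts that relation for $\dot{q}$ with a secant iteration, and forms $H = p \dot{q} - L$. The function names `dL_dv` and `legendre_transform` are hypothetical helpers introduced only for this illustration.

```python
def dL_dv(L, q, v, h=1e-4):
    """Central-difference estimate of the generalized momentum p = dL/dv."""
    return (L(q, v + h) - L(q, v - h)) / (2.0 * h)


def legendre_transform(L, q, p, v0=0.0, tol=1e-9, max_iter=100):
    """Solve p = dL/dv for v by secant iteration, then return (H, v) with H = p*v - L."""
    v_prev, v = v0, v0 + 1.0
    f_prev = dL_dv(L, q, v_prev) - p
    for _ in range(max_iter):
        f = dL_dv(L, q, v) - p
        if abs(f) < tol or f == f_prev:
            break
        # Secant update; the right-hand side is evaluated with the old values.
        v_prev, v, f_prev = v, v - f * (v - v_prev) / (f - f_prev), f
    return p * v - L(q, v), v


# Illustrative values (assumptions): m = 2, k = 3, evaluated at q = 0.5, p = 1.2.
m, k = 2.0, 3.0


def L_osc(q, v):
    """Lagrangian L = T - U for a mass on a linear spring."""
    return 0.5 * m * v**2 - 0.5 * k * q**2


H_value, v_value = legendre_transform(L_osc, q=0.5, p=1.2)
T = 0.5 * m * v_value**2          # kinetic energy at the recovered velocity
U = 0.5 * k * 0.5**2              # potential energy at q = 0.5
print(H_value, T + U, H_value + L_osc(0.5, v_value), 2.0 * T)
```

For this velocity-independent potential the printed values illustrate both $H = T + U$ and the identity $H + L = 2T$.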


8.3  Hamilton’s  equations  of  motion 

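The explicit form of the Legendre transform 8.10 gives that the time derivative of the generalized coordinate
$q_j$ is

$$ \dot{q}_j = \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial p_j} \tag{8.15} $$

The Euler-Lagrange equation 6.60 is

$$ \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.16} $$

This gives the corresponding Hamilton equation for the time derivative of $p_j$ to be

$$ \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_j} = \dot{p}_j = \frac{\partial L}{\partial q_j} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.17} $$

Substituting equation 8.13 into equation 8.17 leads to the second Hamilton equation of motion

$$ \dot{p}_j = -\frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial q_j} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.18} $$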


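One can explore further the implications of Hamiltonian mechanics by taking the time differential of (8.3)
giving

$$ \frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \dot{q}_j \frac{dp_j}{dt} + p_j \frac{d\dot{q}_j}{dt} - \frac{\partial L}{\partial q_j} \frac{dq_j}{dt} - \frac{\partial L}{\partial \dot{q}_j} \frac{d\dot{q}_j}{dt} \right) - \frac{\partial L}{\partial t} \tag{8.19} $$

Inserting the conjugate momenta $p_j = \partial L / \partial \dot{q}_j$ and equation 8.17 into equation 8.19 results in

$$ \frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \dot{q}_j \dot{p}_j + p_j \ddot{q}_j - \left[ \dot{p}_j - \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} - Q_j^{EXC} \right] \dot{q}_j - p_j \ddot{q}_j \right) - \frac{\partial L}{\partial t} \tag{8.20} $$

The second and fourth terms cancel as well as the $\dot{q}_j \dot{p}_j$ terms, leaving

$$ \frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \dot{q}_j \right) - \frac{\partial L}{\partial t} \tag{8.21} $$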


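This is the generalized energy theorem given by equation 7.38.

The total differential of the Hamiltonian also can be written as

$$ \frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \frac{\partial H}{\partial p_j} \dot{p}_j + \frac{\partial H}{\partial q_j} \dot{q}_j \right) + \frac{\partial H}{\partial t} \tag{8.22} $$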


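Using equations 8.15 and 8.18 to substitute for $\partial H / \partial p_j$ and $\partial H / \partial q_j$ in equation 8.22 gives

$$ \frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \dot{q}_j \right) + \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial t} \tag{8.23} $$

Note that equation 8.23 must equal the generalized energy theorem, equation 8.21. Therefore,

$$ \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \tag{8.24} $$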


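In summary, Hamilton's equations of motion are given by

$$ \dot{q}_j = \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial p_j} \tag{8.25} $$

$$ \dot{p}_j = -\frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial q_j} + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \tag{8.26} $$

$$ \frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \dot{q}_j \right) - \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial t} \tag{8.27} $$

The symmetry of Hamilton's equations of motion is illustrated when the Lagrange multiplier and
generalized forces are zero. Then

$$ \dot{q}_j = \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial p_j} \tag{8.28} $$

$$ \dot{p}_j = -\frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial q_j} \tag{8.29} $$

$$ \frac{dH(\mathbf{p}, \mathbf{q}, t)}{dt} = \frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial t} = -\frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial t} \tag{8.30} $$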


This simplified form illustrates the symmetry of Hamilton's equations of motion. Many books present
the Hamiltonian only for this simplified case, where the system is holonomic and conservative, and generalized
coordinates are used.
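As a hedged illustration of how the simplified equations $\dot{q}_j = \partial H / \partial p_j$ and $\dot{p}_j = -\partial H / \partial q_j$ can be used in practice, the sketch below integrates them numerically for any Hamiltonian supplied as a Python function. It assumes no Lagrange multiplier or generalized force terms, approximates the partial derivatives by central differences, and advances the state with a classical fourth-order Runge-Kutta step. The helper names (`grad_H`, `hamilton_rhs`, `rk4_step`) and the test Hamiltonian with m = 1, k = 4 are assumptions made for this example.

```python
import math


def grad_H(H, q, p, h=1e-6):
    """Central-difference estimates of (dH/dq_j, dH/dp_j) for lists q and p."""
    dHdq, dHdp = [], []
    for j in range(len(q)):
        q_up, q_dn = list(q), list(q)
        q_up[j] += h
        q_dn[j] -= h
        dHdq.append((H(q_up, p) - H(q_dn, p)) / (2.0 * h))
        p_up, p_dn = list(p), list(p)
        p_up[j] += h
        p_dn[j] -= h
        dHdp.append((H(q, p_up) - H(q, p_dn)) / (2.0 * h))
    return dHdq, dHdp


def hamilton_rhs(H, q, p):
    """Return (qdot, pdot) with qdot_j = dH/dp_j and pdot_j = -dH/dq_j."""
    dHdq, dHdp = grad_H(H, q, p)
    return dHdp, [-g for g in dHdq]


def rk4_step(H, q, p, dt):
    """Advance (q, p) by one classical fourth-order Runge-Kutta step."""
    def shifted(base, rate, s):
        return [b + s * r for b, r in zip(base, rate)]

    k1q, k1p = hamilton_rhs(H, q, p)
    k2q, k2p = hamilton_rhs(H, shifted(q, k1q, dt / 2), shifted(p, k1p, dt / 2))
    k3q, k3p = hamilton_rhs(H, shifted(q, k2q, dt / 2), shifted(p, k2p, dt / 2))
    k4q, k4p = hamilton_rhs(H, shifted(q, k3q, dt), shifted(p, k3p, dt))
    q_new = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(q, k1q, k2q, k3q, k4q)]
    p_new = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1p, k2p, k3p, k4p)]
    return q_new, p_new


def H_osc(q, p, m=1.0, k=4.0):
    """Illustrative test Hamiltonian: H = p^2/(2m) + k q^2/2."""
    return p[0] ** 2 / (2.0 * m) + 0.5 * k * q[0] ** 2


q, p = [1.0], [0.0]
E0 = H_osc(q, p)
for _ in range(1000):               # integrate to t = 1 with dt = 0.001
    q, p = rk4_step(H_osc, q, p, 0.001)
energy_drift = abs(H_osc(q, p) - E0)
print(q[0], math.cos(2.0), energy_drift)
```

Because this test Hamiltonian has no explicit time dependence, the computed energy should stay essentially constant, and the coordinate should follow $\cos(\omega t)$ with $\omega = \sqrt{k/m} = 2$.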


8.3.1  Canonical  equations  of  motion 

Hamilton's equations of motion, summarized in equations 8.25-8.27, use either a minimal set of generalized
coordinates or the Lagrange multiplier terms to account for holonomic constraints, or generalized forces
$Q_j^{EXC}$ to account for non-holonomic or other forces. Hamilton's equations of motion usually are called
the canonical equations of motion. The term canonical has nothing to do with religion or canon law;
the reason for this name has bewildered many generations of students of classical mechanics. The term was
introduced by Jacobi in 1837 to designate a simple and fundamental set of conjugate variables and equations.
Note the symmetry of Hamilton's two canonical equations, and the fact that the canonical variables $p_k, q_k$
are treated as independent canonical variables. The Lagrangian mechanics coordinates $(\mathbf{q}, \dot{\mathbf{q}}, t)$ are replaced by
the Hamiltonian mechanics coordinates $(\mathbf{q}, \mathbf{p}, t)$ where the conjugate momenta $\mathbf{p}$ are taken to be independent
of the coordinate $\mathbf{q}$.

Lagrange was the first to derive the canonical equations but he did not recognize them as a basic set of
equations of motion. Hamilton derived the canonical equations of motion from his fundamental variational
principle, chapter 13.2, and made them the basis for a far-reaching theory of dynamics. Hamilton's equations
give 2s first-order differential equations for the $p_k, q_k$ of the $s = n - m$ degrees of freedom. Lagrange's
equations give s second-order differential equations for the s independent generalized coordinates $q_k$.

It has been shown that $H(\mathbf{p}, \mathbf{q}, t)$ and $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ are the Legendre transforms of each other. Although
the  Lagrangian  formulation  is  ideal  for  solving  numerical  problems  in  classical  mechanics,  the  Hamiltonian 
formulation  provides  a better  framework  for  conceptual  extensions  to  other  fields  of  physics  since  it  is  written 
in  terms  of  the  fundamental  conjugate  coordinates,  q,  p.  The  Hamiltonian  is  used  extensively  in  modern 
physics,  including  quantum  physics,  as  discussed  in  chapters  14  and  17.  For  example,  in  quantum  mechanics 
there  is  a straightforward  relation  between  the  classical  and  quantal  representations  of  momenta;  this  does 
not  exist  for  the  velocities. 

The concept of state space, introduced in chapter 3.3.2, applies naturally to Lagrangian mechanics since
$(\mathbf{q}, \dot{\mathbf{q}})$ are the generalized coordinates used in Lagrangian mechanics. The concept of phase space, introduced
in chapter 3.3.3, applies naturally to Hamiltonian mechanics since $(\mathbf{p}, \mathbf{q})$ are the generalized coordinates
used in Hamiltonian mechanics.
8.4 Hamiltonian in different coordinate systems

Prior  to  solving  problems  using  Hamiltonian  mechanics,  it  is  useful  to  express  the  Hamiltonian  in  cylindrical 
and  spherical  coordinates  for  the  special  case  of  conservative  forces  since  these  are  encountered  frequently 
in  physics. 


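8.4.1 Cylindrical coordinates $\rho, z, \phi$

Consider cylindrical coordinates $\rho, z, \phi$. Expressed in cartesian coordinates

$$ x = \rho \cos\phi, \qquad y = \rho \sin\phi, \qquad z = z \tag{8.31} $$

Using appendix table C.3, the Lagrangian can be written as

$$ L = T - U = \frac{m}{2} \left( \dot{\rho}^2 + \rho^2 \dot{\phi}^2 + \dot{z}^2 \right) - U(\rho, z, \phi) \tag{8.32} $$

The conjugate momenta are

$$ p_\rho = \frac{\partial L}{\partial \dot{\rho}} = m \dot{\rho} \tag{8.33} $$

$$ p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m \rho^2 \dot{\phi} \tag{8.34} $$

$$ p_z = \frac{\partial L}{\partial \dot{z}} = m \dot{z} \tag{8.35} $$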

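Assume a conservative force, then H is conserved. Since the transformation from cartesian to
non-rotating generalized cylindrical coordinates is time independent, then H = E. Then using (8.32-8.35) gives
the Hamiltonian in cylindrical coordinates to be

$$ H(\mathbf{q}, \mathbf{p}, t) = \sum_i p_i \dot{q}_i - L(\mathbf{q}, \dot{\mathbf{q}}, t) = \left( p_\rho \dot{\rho} + p_\phi \dot{\phi} + p_z \dot{z} \right) - \frac{m}{2} \left( \dot{\rho}^2 + \rho^2 \dot{\phi}^2 + \dot{z}^2 \right) + U(\rho, z, \phi) \tag{8.36} $$

$$ = \frac{1}{2m} \left( p_\rho^2 + \frac{p_\phi^2}{\rho^2} + p_z^2 \right) + U(\rho, z, \phi) \tag{8.37} $$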

(8.37) 


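The canonical equations of motion in cylindrical coordinates can be written as

$$ \dot{p}_\rho = -\frac{\partial H}{\partial \rho} = \frac{p_\phi^2}{m \rho^3} - \frac{\partial U}{\partial \rho} \tag{8.38} $$

$$ \dot{p}_\phi = -\frac{\partial H}{\partial \phi} = -\frac{\partial U}{\partial \phi} \tag{8.39} $$

$$ \dot{p}_z = -\frac{\partial H}{\partial z} = -\frac{\partial U}{\partial z} \tag{8.40} $$

$$ \dot{\rho} = \frac{\partial H}{\partial p_\rho} = \frac{p_\rho}{m} \tag{8.41} $$

$$ \dot{\phi} = \frac{\partial H}{\partial p_\phi} = \frac{p_\phi}{m \rho^2} \tag{8.42} $$

$$ \dot{z} = \frac{\partial H}{\partial p_z} = \frac{p_z}{m} \tag{8.43} $$

Note that if $\phi$ is cyclic, that is $\partial H / \partial \phi = 0$, then the angular momentum about the z axis, $p_\phi$, is a constant
of motion. Similarly, if z is cyclic, then $p_z$ is a constant of motion.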
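A quick numerical consistency check of the cylindrical Hamiltonian can be sketched as follows. Starting from an arbitrary cartesian state, the code forms the conjugate momenta $p_\rho = m\dot{\rho}$, $p_\phi = m\rho^2\dot{\phi}$, $p_z = m\dot{z}$ and confirms that the cylindrical Hamiltonian reproduces the cartesian energy $\frac{1}{2} m v^2 + U$, and that $p_\phi$ equals the z component of the angular momentum. The numerical values and the potential value are hypothetical, chosen only for illustration.

```python
import math


def to_cylindrical(m, x, y, z, vx, vy, vz):
    """Map a cartesian state to (rho, phi, z, p_rho, p_phi, p_z)."""
    rho = math.hypot(x, y)
    phi = math.atan2(y, x)
    rho_dot = (x * vx + y * vy) / rho          # radial velocity
    phi_dot = (x * vy - y * vx) / rho**2       # angular velocity about the z axis
    return rho, phi, z, m * rho_dot, m * rho**2 * phi_dot, m * vz


def hamiltonian_cylindrical(m, rho, p_rho, p_phi, p_z, U_value):
    """H = (p_rho^2 + p_phi^2/rho^2 + p_z^2)/(2m) + U, with U passed in as a number."""
    return (p_rho**2 + p_phi**2 / rho**2 + p_z**2) / (2.0 * m) + U_value


# Hypothetical state and potential value used only for this check.
m = 1.5
x, y, z = 1.0, 2.0, -0.5
vx, vy, vz = 0.3, -0.7, 1.1
U_value = 0.25 * (x**2 + y**2 + z**2)

rho, phi, z_c, p_rho, p_phi, p_z = to_cylindrical(m, x, y, z, vx, vy, vz)
H_cyl = hamiltonian_cylindrical(m, rho, p_rho, p_phi, p_z, U_value)
E_cart = 0.5 * m * (vx**2 + vy**2 + vz**2) + U_value
L_z = m * (x * vy - y * vx)                    # cartesian angular momentum about z
print(H_cyl, E_cart, p_phi, L_z)
```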




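8.4.2 Spherical coordinates $r, \theta, \phi$

Appendix table C.4 shows that the spherical coordinates are related to the cartesian coordinates by

$$ x = r \sin\theta \cos\phi, \qquad y = r \sin\theta \sin\phi, \qquad z = r \cos\theta \tag{8.44} $$

The Lagrangian is

$$ L = T - U = \frac{m}{2} \left( \dot{r}^2 + r^2 \dot{\theta}^2 + r^2 \sin^2\theta \, \dot{\phi}^2 \right) - U(r, \theta, \phi) \tag{8.45} $$

The conjugate momenta are

$$ p_r = \frac{\partial L}{\partial \dot{r}} = m \dot{r} \tag{8.46} $$

$$ p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m r^2 \dot{\theta} \tag{8.47} $$

$$ p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m r^2 \sin^2\theta \, \dot{\phi} \tag{8.48} $$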

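Assuming a conservative force then H is conserved. Since the transformation from cartesian to generalized
spherical coordinates is time independent, then H = E. Thus using (8.46-8.48) the Hamiltonian is given
by

$$ H(\mathbf{q}, \mathbf{p}, t) = \sum_i p_i \dot{q}_i - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.49} $$

$$ = \left( p_r \dot{r} + p_\theta \dot{\theta} + p_\phi \dot{\phi} \right) - \frac{m}{2} \left( \dot{r}^2 + r^2 \dot{\theta}^2 + r^2 \sin^2\theta \, \dot{\phi}^2 \right) + U(r, \theta, \phi) \tag{8.50} $$

$$ = \frac{1}{2m} \left( p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2 \sin^2\theta} \right) + U(r, \theta, \phi) \tag{8.51} $$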

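Then the canonical equations of motion in spherical coordinates are

$$ \dot{p}_r = -\frac{\partial H}{\partial r} = \frac{1}{m r^3} \left( p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta} \right) - \frac{\partial U}{\partial r} \tag{8.52} $$

$$ \dot{p}_\theta = -\frac{\partial H}{\partial \theta} = \frac{1}{m r^2} \left( \frac{p_\phi^2 \cos\theta}{\sin^3\theta} \right) - \frac{\partial U}{\partial \theta} \tag{8.53} $$

$$ \dot{p}_\phi = -\frac{\partial H}{\partial \phi} = -\frac{\partial U}{\partial \phi} \tag{8.54} $$

$$ \dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{m} \tag{8.55} $$

$$ \dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{m r^2} \tag{8.56} $$

$$ \dot{\phi} = \frac{\partial H}{\partial p_\phi} = \frac{p_\phi}{m r^2 \sin^2\theta} \tag{8.57} $$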


Note that if the coordinate $\phi$ is cyclic, that is $\partial H / \partial \phi = 0$, then the angular momentum $p_\phi$ is conserved. Also
if the $\theta$ coordinate is cyclic, and $p_\phi = 0$, that is, there is no change in the angular momentum perpendicular
to the z axis, then $p_\theta$ is conserved.

An especially important spherically-symmetric Hamiltonian is that for a central field. Central fields, such
as the gravitational or Coulomb fields of uniform spherical mass or charge distributions, are spherically
symmetric and then both $\theta$ and $\phi$ are cyclic. Thus the projection of the angular momentum $p_\phi$ about the z
axis is conserved for these spherically symmetric potentials. In addition, since both $p_\theta$ and $p_\phi$ are conserved,
then the total angular momentum also must be conserved as is predicted by Noether's theorem.
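The angular terms in the spherical canonical equations are easy to get wrong by a sign or a power of $\sin\theta$, so a finite-difference check is a useful sanity test. The sketch below assumes a central potential $U = -k/r$ with illustrative constants and compares $-\partial H / \partial r$ and $-\partial H / \partial \theta$, estimated numerically from the spherical Hamiltonian, against the closed-form right-hand sides for $\dot{p}_r$ and $\dot{p}_\theta$. The sample state and the helper names are assumptions for this example only.

```python
import math


def H_spherical(r, theta, p_r, p_theta, p_phi, m=1.0, k=1.0):
    """Spherical Hamiltonian with an assumed central potential U = -k/r."""
    kinetic = (p_r**2 + p_theta**2 / r**2
               + p_phi**2 / (r**2 * math.sin(theta)**2)) / (2.0 * m)
    return kinetic - k / r


def pdot_r(r, theta, p_theta, p_phi, m=1.0, k=1.0):
    """Closed-form pdot_r, using dU/dr = k/r^2 for the assumed potential."""
    return (p_theta**2 + p_phi**2 / math.sin(theta)**2) / (m * r**3) - k / r**2


def pdot_theta(r, theta, p_phi, m=1.0):
    """Closed-form pdot_theta, using dU/dtheta = 0 for the assumed potential."""
    return p_phi**2 * math.cos(theta) / (m * r**2 * math.sin(theta)**3)


# Arbitrary sample point in phase space (hypothetical values).
state = dict(r=1.3, theta=0.8, p_r=0.2, p_theta=0.5, p_phi=0.7)


def minus_dH(name, h=1e-6):
    """Central-difference estimate of -dH/d(name) at the sample state."""
    up, down = dict(state), dict(state)
    up[name] += h
    down[name] -= h
    return -(H_spherical(**up) - H_spherical(**down)) / (2.0 * h)


numeric_pdot_r = minus_dH("r")
numeric_pdot_theta = minus_dH("theta")
exact_pdot_r = pdot_r(state["r"], state["theta"], state["p_theta"], state["p_phi"])
exact_pdot_theta = pdot_theta(state["r"], state["theta"], state["p_phi"])
print(numeric_pdot_r, exact_pdot_r, numeric_pdot_theta, exact_pdot_theta)
```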
8.5 Applications of Hamiltonian Dynamics

The equations of motion of a system can be derived using the Hamiltonian coupled with Hamilton's equations
of motion, that is, equations 8.25-8.27.

Formally the Hamiltonian is constructed from the Lagrangian. That is

1) Select a set of independent generalized coordinates $q_i$.

2) Partition the active forces.

3) Construct the Lagrangian $L(q_i, \dot{q}_i, t)$.

4) Derive the conjugate generalized momenta via $p_i = \partial L / \partial \dot{q}_i$.

5) Knowing $L$, $\dot{q}_i$, $p_i$ derive $H = \sum_i p_i \dot{q}_i - L$.

6) Derive $\dot{q}_k = \partial H / \partial p_k$ and $\dot{p}_j = -\partial H / \partial q_j + \sum_{k=1}^{m} \lambda_k \, \partial g_k / \partial q_j + Q_j^{EXC}$.

This procedure appears to be unnecessarily complicated compared to just using the Lagrangian plus
Lagrangian mechanics to derive the equations of motion. Fortunately the above lengthy procedure often can
be bypassed for conservative systems. That is, if the following conditions are satisfied:

i) $L = T(\dot{q}) - U(q)$, that is, $U(q)$ is independent of the velocity $\dot{q}$,

ii) the generalized coordinates are time independent,

then it is possible to use the fact that $H = T + U = E$.

The  following  five  examples  illustrate  the  use  of  Hamiltonian  mechanics  to  derive  the  equations  of  motion. 


8.1  Example:  Motion  in  a uniform  gravitational  field 

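Consider a mass m in a uniform gravitational field acting in the $-\hat{z}$ direction. The Lagrangian for this
simple case is

$$ L = \frac{1}{2} m \left( \dot{x}^2 + \dot{y}^2 + \dot{z}^2 \right) - mgz $$

Therefore the generalized momenta are $p_x = \partial L / \partial \dot{x} = m\dot{x}$, $p_y = \partial L / \partial \dot{y} = m\dot{y}$, $p_z = \partial L / \partial \dot{z} = m\dot{z}$. The corresponding
Hamiltonian H is

$$ H = \sum_i p_i \dot{q}_i - L = p_x \dot{x} + p_y \dot{y} + p_z \dot{z} - L $$

$$ = \left( \frac{p_x^2}{m} + \frac{p_y^2}{m} + \frac{p_z^2}{m} \right) - \frac{1}{2} \left( \frac{p_x^2}{m} + \frac{p_y^2}{m} + \frac{p_z^2}{m} \right) + mgz = \frac{1}{2} \left( \frac{p_x^2}{m} + \frac{p_y^2}{m} + \frac{p_z^2}{m} \right) + mgz $$

Hamilton's equations give $\dot{x} = \partial H / \partial p_x = p_x / m$, $\dot{y} = p_y / m$, $\dot{z} = p_z / m$, together with
$\dot{p}_x = -\partial H / \partial x = 0$, $\dot{p}_y = -\partial H / \partial y = 0$, $\dot{p}_z = -\partial H / \partial z = -mg$.

Combining these gives that $\ddot{x} = 0$, $\ddot{y} = 0$, $\ddot{z} = -g$. Note that the linear momenta $p_x$ and $p_y$ are constants
of motion whereas the rate of change of $p_z$ is given by the gravitational force $-mg$. Note also that $H = T + U$
for this conservative system.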
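The vertical motion in this example can be reproduced by stepping Hamilton's equations $\dot{z} = p_z / m$ and $\dot{p}_z = -mg$ directly. The sketch below uses a kick-drift-kick (leapfrog) update, which is exact for a constant force, and compares the result with the familiar closed form $z(t) = z_0 + v_0 t - \frac{1}{2} g t^2$. The mass, initial height, and initial velocity are hypothetical values chosen for illustration.

```python
def kick_drift_kick(z, p_z, m, g, dt):
    """One leapfrog step for H = p_z^2/(2m) + m g z."""
    p_half = p_z - 0.5 * m * g * dt        # half step of pdot_z = -dH/dz = -m g
    z_new = z + (p_half / m) * dt          # full step of zdot = dH/dp_z = p_z/m
    return z_new, p_half - 0.5 * m * g * dt


m, g = 0.2, 9.81
z0, v0 = 10.0, 3.0                         # hypothetical initial height and velocity
z, p_z = z0, m * v0
dt, n_steps = 0.01, 100
for _ in range(n_steps):
    z, p_z = kick_drift_kick(z, p_z, m, g, dt)

t = dt * n_steps
z_exact = z0 + v0 * t - 0.5 * g * t**2     # closed-form solution of zddot = -g
v_exact = v0 - g * t
print(z, z_exact, p_z / m, v_exact)
```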


8.2 Example: One-dimensional harmonic oscillator

Consider a mass m subject to a linear restoring force with constant k. The Lagrangian $L = T - U$ equals

$$ L = \frac{1}{2} m \dot{x}^2 - \frac{1}{2} k x^2 $$

Therefore the generalized momentum is

$$ p_x = \frac{\partial L}{\partial \dot{x}} = m \dot{x} $$

The Hamiltonian H is

$$ H = \sum_i p_i \dot{q}_i - L = p_x \dot{x} - L = \frac{p_x p_x}{m} - \frac{1}{2} \frac{p_x^2}{m} + \frac{1}{2} k x^2 = \frac{1}{2} \frac{p_x^2}{m} + \frac{1}{2} k x^2 $$

Note that the Lagrangian is not explicitly time dependent, thus the Hamiltonian will be a constant of motion.
Hamilton's equations give that

$$ \dot{x} = \frac{\partial H}{\partial p_x} = \frac{p_x}{m} $$

or

$$ p_x = m \dot{x} $$

In addition

$$ -\dot{p}_x = \frac{\partial H}{\partial x} = \frac{\partial U}{\partial x} = kx $$

Combining these gives that

$$ \ddot{x} + \frac{k}{m} x = 0 $$

which is the equation of motion for the harmonic oscillator.
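As a hedged numerical illustration of this example, the sketch below integrates $\dot{x} = p_x / m$ and $\dot{p}_x = -kx$ with a leapfrog scheme over one analytic period $2\pi\sqrt{m/k}$ and checks that the phase-space point returns to its starting position with the Hamiltonian unchanged. The values of m, k, and the initial displacement are assumptions made for the demonstration, and leapfrog is a choice made here rather than a method taken from the text.

```python
import math


def leapfrog(x, p, m, k, dt):
    """One kick-drift-kick step for H = p^2/(2m) + k x^2/2."""
    p_half = p - 0.5 * k * x * dt              # pdot_x = -dH/dx = -k x
    x_new = x + (p_half / m) * dt              # xdot = dH/dp_x = p_x/m
    return x_new, p_half - 0.5 * k * x_new * dt


def energy(x, p, m, k):
    """Hamiltonian of the one-dimensional harmonic oscillator."""
    return p**2 / (2.0 * m) + 0.5 * k * x**2


m, k = 0.5, 8.0                                # hypothetical mass and spring constant
period = 2.0 * math.pi * math.sqrt(m / k)      # analytic period of xddot + (k/m) x = 0
n_steps = 10000
dt = period / n_steps

x0 = 0.1
x, p = x0, 0.0
E0 = energy(x, p, m, k)
for _ in range(n_steps):
    x, p = leapfrog(x, p, m, k, dt)
E1 = energy(x, p, m, k)
print(x, p, E0, E1)
```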


8.3  Example:  Plane  pendulum 

The plane pendulum, in a uniform gravitational field g, is an interesting system to consider. There is
only one generalized coordinate, $\theta$, and the Lagrangian for this system is

$$ L = \frac{1}{2} m l^2 \dot{\theta}^2 + mgl \cos\theta $$

The momentum conjugate to $\theta$ is

$$ p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m l^2 \dot{\theta} $$

which is the angular momentum about the pivot point.
The Hamiltonian is

$$ H = \sum_i p_i \dot{q}_i - L = p_\theta \dot{\theta} - L = \frac{1}{2} m l^2 \dot{\theta}^2 - mgl \cos\theta = \frac{p_\theta^2}{2 m l^2} - mgl \cos\theta $$

Hamilton's equations of motion give

$$ \dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{m l^2} $$

$$ \dot{p}_\theta = -\frac{\partial H}{\partial \theta} = -mgl \sin\theta $$

Note that the Lagrangian and Hamiltonian are not explicit functions of time, therefore the Hamiltonian is conserved.
Also the potential is velocity independent and there is no coordinate transformation, thus the Hamiltonian
equals the total energy, that is

$$ H = \frac{p_\theta^2}{2 m l^2} - mgl \cos\theta = E $$

where E is a constant of motion. Note that the angular momentum $p_\theta$ is not a constant of motion since $\dot{p}_\theta$
explicitly depends on $\theta$.
The solutions for the plane pendulum on a $(\theta, p_\theta)$ phase
diagram, shown in the adjacent figure, illustrate the motion. The
upper phase-space plot shows the range $(\theta = \pm\pi, p_\theta)$. Note that
$\theta = +\pi$ and $-\pi$ correspond to the same physical point, that is
the phase diagram should be rolled into a cylinder connected along
the dashed lines. The lower phase-space plot shows two cycles for
$\theta$ to better illustrate the cyclic nature of the phase diagram. The
corresponding state-space diagram is shown in figure 3.4. The
trajectories are ellipses for low energy $-mgl < E < mgl$
corresponding to oscillations of the pendulum about $\theta = 0$. The center
of the ellipse (0, 0) is a stable equilibrium point for the oscillation.
However, there is a phase change to rotational motion about the
horizontal axis when $E > mgl$, that is, the pendulum swings
around a circle continuously, i.e. it rotates continuously in one
direction about the horizontal axis. The phase change occurs at
$E = mgl$ and is designated by the separatrix trajectory.

The plot of $p_\theta$ versus $\theta$ for the plane pendulum is better
presented on a cylindrical phase space representation since $\theta$ is a
cyclic variable that cycles around the cylinder, whereas $p_\theta$
oscillates equally about zero having both positive and negative values.
When wrapped around a cylinder the unstable and stable
equilibrium points will be at diametrically opposite locations on
the surface of the cylinder at $p_\theta = 0$. For small oscillations
about equilibrium, also called librations, the correlation between
$p_\theta$ and $\theta$ is given by the clockwise closed ellipses wrapped on the
cylindrical surface, whereas for energies $E > mgl$ the positive
$p_\theta$ corresponds to counterclockwise rotations while the negative
$p_\theta$ corresponds to clockwise rotations.
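The division of the pendulum phase space into librations and rotations follows directly from the conserved energy $E = p_\theta^2 / (2ml^2) - mgl\cos\theta$ compared with the separatrix value $mgl$. The following sketch, with hypothetical default values m = 1, l = 1, g = 9.81, classifies a phase-space point and evaluates the separatrix momentum $p_\theta = ml\sqrt{2gl(1 + \cos\theta)}$ obtained by setting $E = mgl$. The function names are illustrative only.

```python
import math


def pendulum_energy(theta, p_theta, m=1.0, l=1.0, g=9.81):
    """Hamiltonian of the plane pendulum, H = p^2/(2 m l^2) - m g l cos(theta)."""
    return p_theta**2 / (2.0 * m * l**2) - m * g * l * math.cos(theta)


def separatrix_momentum(theta, m=1.0, l=1.0, g=9.81):
    """Positive branch of p_theta on the separatrix, from E = m g l."""
    return m * l * math.sqrt(2.0 * g * l * (1.0 + math.cos(theta)))


def classify(theta, p_theta, m=1.0, l=1.0, g=9.81, rel_tol=1e-9):
    """Label a phase-space point as libration, rotation, or separatrix."""
    E = pendulum_energy(theta, p_theta, m, l, g)
    E_sep = m * g * l
    if abs(E - E_sep) <= rel_tol * E_sep:
        return "separatrix"
    return "libration" if E < E_sep else "rotation"


print(classify(0.5, 0.0))                                   # released from rest at 0.5 rad
print(classify(0.0, 1.1 * separatrix_momentum(0.0)))        # just above the separatrix
print(classify(1.0, separatrix_momentum(1.0)))              # on the separatrix
```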


Phase-space  diagrams  for  the  plane 
pendulum.  The  separatrix  (bold  line) 
separates  the  oscillatory  solutions  from 
the  rolling  solutions.  The  upper  (a) 
shows  one  complete  cycle  while  the  lower 
(b)  shows  two  complete  cycles. 


8.4  Example:  Hooke’s  law  force  constrained  to  the  surface  of  a cylinder 

Consider the case where a mass m is attracted by a
force directed toward the origin and proportional to the
distance from the origin. Determine the Hamiltonian
if the mass is constrained to move on the surface of a
cylinder defined by

$$ x^2 + y^2 = R^2 $$

It is natural to transform this problem to cylindrical
coordinates $\rho, z, \theta$. Since the force is just Hooke's law

$$ \mathbf{F} = -k \mathbf{r} $$

the potential is the same as for the harmonic oscillator,
that is

$$ U = \frac{1}{2} k r^2 = \frac{1}{2} k \left( \rho^2 + z^2 \right) $$

This is independent of $\theta$, and thus $\theta$ is cyclic.

In cylindrical coordinates the velocity is

$$ \mathbf{v} = \dot{\rho} \, \hat{\rho} + \rho \dot{\theta} \, \hat{\theta} + \dot{z} \, \hat{z} $$

Confined to the surface of the cylinder means that

$$ \rho = R, \qquad \dot{\rho} = 0 $$

Mass attracted to origin by force proportional to
distance from origin with the motion constrained
to the surface of a cylinder.
Then the Lagrangian simplifies to

$$ L = T - U = \frac{1}{2} m \left( R^2 \dot{\theta}^2 + \dot{z}^2 \right) - \frac{1}{2} k \left( R^2 + z^2 \right) $$

The generalized coordinates are $\theta, z$ and the corresponding generalized momenta are

$$ p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m R^2 \dot{\theta} \tag{a} $$

$$ p_z = \frac{\partial L}{\partial \dot{z}} = m \dot{z} \tag{b} $$

The system is conservative, and the transformation from rectangular to cylindrical coordinates does not
depend explicitly on time. Therefore the Hamiltonian is conserved and equals the total energy. That is

$$ H = \frac{p_\theta^2}{2 m R^2} + \frac{p_z^2}{2m} + \frac{1}{2} k \left( R^2 + z^2 \right) = E $$

The equations of motion then are given by the canonical equations

$$ \dot{p}_\theta = -\frac{\partial H}{\partial \theta} = 0 \tag{c} $$

$$ \dot{p}_z = -\frac{\partial H}{\partial z} = -kz \tag{d} $$

$$ \dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{m R^2} $$

$$ \dot{z} = \frac{\partial H}{\partial p_z} = \frac{p_z}{m} $$

Equations (a) and (c) imply that

$$ p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m R^2 \dot{\theta} = \text{constant} $$

Thus the angular momentum about the axis of the cylinder is conserved, that is, $\theta$ is a cyclic variable.
Combining equations (b) and (d) implies that

$$ \ddot{z} + \frac{k}{m} z = 0 $$

This is the equation for simple harmonic motion with angular frequency $\omega = \sqrt{k/m}$. The symmetries imply
that this problem has the same solutions for the z coordinate as the harmonic oscillator, while the $\theta$ coordinate
moves with constant angular velocity.


8.5  Example:  Electron  motion  in  a cylindrical  magnetron 

A magnetron  comprises  a hot  cylindrical  wire  cathode  that  emits  electrons  and  is  at  a high  negative  voltage. 
It  is  surrounded  by  a larger  diameter  cylindrical  anode  at  ground  potential.  A uniform  magnetic  field  runs 
parallel  to  the  cylindrical  axis  of  the  magnetron.  The  electron  beam  excites  a multiple  set  of  microwave 
cavities  located  around  the  circumference  of  the  cylindrical  wall  of  the  anode.  The  magnetron  was  invented 
in  England  during  World  War  2 to  generate  microwaves  required  for  the  development  of  radar. 

Consider a non-relativistic electron of mass m and charge $-e$ in a cylindrical magnetron moving between
the central cathode wire, of radius a at a negative electric potential $-\phi_0$, and a concentric cylindrical anode
conductor of radius R which has zero electric potential. There is a uniform constant magnetic field B parallel
to the cylindrical axis of the magnetron.

Using SI units and cylindrical coordinates $(r, \theta, z)$ aligned with the axis of the magnetron, the
electromagnetic force Lagrangian, given in chapter 6.10, equals

$$ L = \frac{1}{2} m \dot{\mathbf{r}}^2 + e \left( \phi - \dot{\mathbf{r}} \cdot \mathbf{A} \right) $$

The electric and vector potentials for the magnetron geometry are

$$ \phi = -\phi_0 \frac{\ln(r/R)}{\ln(a/R)} $$

$$ \mathbf{A} = \frac{1}{2} B r \, \hat{\theta} $$
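Thus expressed in cylindrical coordinates the Lagrangian equals

$$ L = \frac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 + \dot{z}^2 \right) + e\phi - \frac{1}{2} e B r^2 \dot{\theta} $$

The generalized momenta are

$$ p_r = \frac{\partial L}{\partial \dot{r}} = m \dot{r} $$

$$ p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m r^2 \dot{\theta} - \frac{1}{2} e B r^2 $$

$$ p_z = \frac{\partial L}{\partial \dot{z}} = m \dot{z} $$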

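Note that the vector potential $\mathbf{A}$ contributes an additional term to the angular momentum $p_\theta$.
Using the above generalized momenta leads to the Hamiltonian

$$ H = p_r \dot{r} + p_\theta \dot{\theta} + p_z \dot{z} - L $$

$$ = \frac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 + \dot{z}^2 \right) - e\phi $$

$$ = \frac{p_r^2}{2m} + \frac{1}{2 m r^2} \left( p_\theta + \frac{1}{2} e B r^2 \right)^2 + \frac{p_z^2}{2m} - e\phi $$

$$ = \frac{1}{2m} \left[ p_r^2 + \left( \frac{p_\theta}{r} + \frac{1}{2} e B r \right)^2 + p_z^2 \right] - e\phi $$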

Note that the Hamiltonian is not an explicit function of time, therefore it is a constant of motion which
equals the total energy.

$$ H = \frac{1}{2m} \left[ p_r^2 + \left( \frac{p_\theta}{r} + \frac{1}{2} e B r \right)^2 + p_z^2 \right] - e\phi = E $$

Since $\dot{p}_i = -\partial H / \partial q_i$, and if H is not an explicit function of $q_i$, then $\dot{p}_i = 0$, that is, $p_i$ is a constant of motion.
Thus $p_\theta$ and $p_z$ are constants of motion.

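Consider the initial conditions $r = a$, $\dot{r} = \dot{\theta} = \dot{z} = 0$. Then

$$ p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m r^2 \dot{\theta} - \frac{1}{2} e B r^2 = -\frac{1}{2} e B a^2 $$

$$ p_z = 0 $$

$$ H = \frac{1}{2m} \left[ p_r^2 + \left( \frac{p_\theta}{r} + \frac{1}{2} e B r \right)^2 \right] + e\phi_0 \frac{\ln(r/R)}{\ln(a/R)} = e\phi_0 $$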

Note that at $r = R$, $p_r$ is given by the last equation since the Hamiltonian equals a constant $e\phi_0$ and the potential is zero at $r = R$. That is, assuming that $a \ll R$, then

$$p_r^2 = 2me\phi_0 - \left( \frac{1}{2} e B R \right)^2$$

Define a critical magnetic field by

$$B_c = \frac{2}{R} \sqrt{\frac{2m\phi_0}{e}}$$

then

$$\left( p_r^2 \right)_{r=R} = \left( B_c^2 - B^2 \right) \left( \frac{1}{2} e R \right)^2$$

Note that if $B < B_c$ then $p_r$ is real at $r = R$. However, if $B > B_c$ then $p_r$ is imaginary at $r = R$, implying that there must be a maximum orbit radius $r_0$ for the electron where $r_0 < R$. That is, the electron trajectories are confined spatially to coaxial cylindrical orbits concentric with the magnetron electromagnetic fields. These closed electron trajectories excite the microwave cavities located in the nearby outer cylindrical wall of the anode.
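The critical-field condition can be checked numerically. The following is a minimal sketch, assuming the $a \ll R$ approximation used above; the function names and the sample values (a 1 kV potential and a 1 cm anode radius) are hypothetical choices made only for illustration.

```python
import math

ELECTRON_MASS = 9.1093837015e-31     # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def critical_field(phi0, anode_radius):
    """B_c = (2/R) sqrt(2 m phi0 / e): the field at which p_r vanishes at r = R."""
    return (2.0 / anode_radius) * math.sqrt(
        2.0 * ELECTRON_MASS * phi0 / ELEMENTARY_CHARGE
    )


def radial_momentum_squared_at_anode(b_field, phi0, anode_radius):
    """p_r^2 at r = R for a << R: 2 m e phi0 - (e B R / 2)^2."""
    return (
        2.0 * ELECTRON_MASS * ELEMENTARY_CHARGE * phi0
        - (0.5 * ELEMENTARY_CHARGE * b_field * anode_radius) ** 2
    )


# Hypothetical sample values: 1 kV potential, 1 cm anode radius.
phi0, radius = 1000.0, 0.01
b_c = critical_field(phi0, radius)
for factor in (0.5, 1.0, 2.0):
    p2 = radial_momentum_squared_at_anode(factor * b_c, phi0, radius)
    print(f"B = {factor:.1f} B_c -> p_r^2 = {p2:.3e}")
```

A positive $p_r^2$ means the electron reaches the anode, while a negative value signals that the orbit is confined inside $r_0 < R$.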
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8.6  Routhian  reduction 

Noether's theorem states that if the coordinate $q_j$ is cyclic, and if the Lagrange multiplier plus generalized force contributions for the $j$th coordinate are zero, then the canonical momentum of the cyclic variable, $p_j$, is a constant of motion, as is discussed in chapter 7.3. Therefore, both $(q_j, p_j)$ are constants of motion for cyclic variables, and these constant $(q_j, p_j)$ coordinates can be factored out of the Hamiltonian $H(\mathbf{p}, \mathbf{q}, t)$. This reduces the number of degrees of freedom included in the Hamiltonian. For this reason, cyclic variables are called ignorable variables in Hamiltonian mechanics. This advantage does not apply to the $(q_j, \dot{q}_j)$ variables used in Lagrangian mechanics since $\dot{q}_j$ is not a constant of motion for a cyclic coordinate. The ability to eliminate the cyclic variables as unknowns in the Hamiltonian is a valuable advantage of Hamiltonian mechanics that is exploited extensively for solving problems, as is described in chapter 14.

It is advantageous to be able to exploit both the Lagrangian and Hamiltonian formulations simultaneously when handling systems that involve a mixture of cyclic and non-cyclic coordinates. The equations of motion for each independent generalized coordinate can be derived independently of the remaining generalized coordinates. Thus it is possible to select either the Hamiltonian or the Lagrangian formulation for each generalized coordinate, independent of what is used for the other generalized coordinates. Routh [Rou1860] devised an elegant and useful hybrid technique that separates the cyclic and non-cyclic generalized coordinates in order to simultaneously exploit the differing advantages of both the Hamiltonian and Lagrangian formulations. The Routhian reduction approach partitions the $\sum_i p_i \dot{q}_i$ kinetic energy term in the Hamiltonian into a cyclic group plus a non-cyclic group, i.e.

$$H(q_1, \dots, q_n; p_1, \dots, p_n; t) = \sum_{i=1}^{n} p_i \dot{q}_i - L = \sum_{cyclic}^{m} p_i \dot{q}_i + \sum_{noncyclic}^{s} p_i \dot{q}_i - L \tag{8.58}$$

Routh's clever idea was to define a new function, called the Routhian, that includes only one of the two partitions of the kinetic energy terms. This makes the Routhian a Hamiltonian for the coordinates for which the kinetic energy terms are included, while the Routhian acts like a negative Lagrangian for the coordinates where the kinetic energy term is omitted. This book defines two Routhians.

$$R_{cyclic}(q_1, \dots, q_n; \dot{q}_1, \dots, \dot{q}_s; p_{s+1}, \dots, p_n; t) \equiv \sum_{cyclic}^{m} p_i \dot{q}_i - L \tag{8.59}$$

$$R_{noncyclic}(q_1, \dots, q_n; p_1, \dots, p_s; \dot{q}_{s+1}, \dots, \dot{q}_n; t) \equiv \sum_{noncyclic}^{s} p_i \dot{q}_i - L \tag{8.60}$$


The first Routhian, called $R_{cyclic}$, includes the kinetic energy terms only for the cyclic variables. It behaves like a Hamiltonian for the cyclic variables and like a negative Lagrangian for the non-cyclic variables. The second Routhian, called $R_{noncyclic}$, includes the kinetic energy terms for only the non-cyclic variables. It behaves like a Hamiltonian for the non-cyclic variables and like a negative Lagrangian for the cyclic variables. These two Routhians complement each other in that they make the Routhian either a Hamiltonian for the cyclic variables, or the converse where the Routhian is a Hamiltonian for the non-cyclic variables. The Routhians use $(q_i, \dot{q}_i)$ to denote those coordinates for which the Routhian behaves like a Lagrangian, and $(q_i, p_i)$ for those coordinates where the Routhian behaves like a Hamiltonian. For uniformity, it is assumed that the degrees of freedom with $1 \le i \le s$ are non-cyclic, while those with $s + 1 \le i \le n$ are ignorable cyclic coordinates.

The  Routhian  is  a hybrid  of  Lagrangian  and  Hamiltonian  mechanics.  Some  textbooks  minimize  discussion 
of  the  Routhian  on  the  grounds  that  this  hybrid  approach  is  not  fundamental.  However,  the  Routhian  is 
used  extensively  in  engineering  in  order  to  derive  the  equations  of  motion  for  rotating  systems.  In  addition 
it  is  used  when  dealing  with  rotating  nuclei  in  nuclear  physics,  rotating  molecules  in  molecular  physics,  and 
rotating  galaxies  in  astrophysics.  The  Routhian  reduction  technique  provides  a powerful  way  to  calculate 
the  intrinsic  properties  for  a rotating  system  in  the  rotating  frame  of  reference.  The  Routhian  approach  is 
included  in  this  textbook  because  it  plays  an  important  role  in  practical  applications  of  rotating  systems,  plus 
it  nicely  illustrates  the  relative  advantages  of  the  Lagrangian  and  Hamiltonian  formulations  in  mechanics. 
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8.6.1  R cyclic  - Routhian  is  a Hamiltonian  for  the  cyclic  variables 

The cyclic Routhian $R_{cyclic}$ is defined assuming that the variables with $1 \le i \le s$ are non-cyclic, where $s = n - m$, while the $m$ variables with $s + 1 \le i \le n$ are ignorable cyclic coordinates. The cyclic Routhian $R_{cyclic}$ expresses the cyclic coordinates in terms of $(q, p)$, which are required for use by Hamilton's equations, while the non-cyclic variables are expressed in terms of $(q, \dot{q})$ for use by the Lagrange equations. That is, the cyclic Routhian $R_{cyclic}$ is defined to be

$$R_{cyclic}(q_1, \dots, q_n; \dot{q}_1, \dots, \dot{q}_s; p_{s+1}, \dots, p_n; t) = \sum_{cyclic}^{m} p_i \dot{q}_i - L \tag{8.61}$$


where the summation $\sum_{cyclic} p_i \dot{q}_i$ is over only the $m$ cyclic variables $s + 1 \le i \le n$. Note that the Lagrangian can be split into the cyclic and the non-cyclic parts

$$R_{cyclic}(q_1, \dots, q_n; \dot{q}_1, \dots, \dot{q}_s; p_{s+1}, \dots, p_n; t) = \sum_{cyclic}^{m} p_i \dot{q}_i - L_{cyclic} - L_{noncyclic} \tag{8.62}$$

The first two terms on the right can be combined to give the Hamiltonian $H_{cyclic}$ for only the $m$ cyclic variables, $i = s+1, s+2, \dots, n$, that is

$$R_{cyclic}(q_1, \dots, q_n; \dot{q}_1, \dots, \dot{q}_s; p_{s+1}, \dots, p_n; t) = H_{cyclic} - L_{noncyclic} \tag{8.63}$$

The Routhian $R_{cyclic}(q_1, \dots, q_n; \dot{q}_1, \dots, \dot{q}_s; p_{s+1}, \dots, p_n; t)$ also can be written in an alternate form

$$R_{cyclic} = \sum_{cyclic}^{m} p_i \dot{q}_i - L = \sum_{i=1}^{n} p_i \dot{q}_i - L - \sum_{noncyclic}^{s} p_i \dot{q}_i \tag{8.64}$$

$$= H - \sum_{noncyclic}^{s} p_i \dot{q}_i \tag{8.65}$$

which is expressed as the complete Hamiltonian minus the kinetic energy term for the noncyclic coordinates. The Routhian $R_{cyclic}$ behaves like a Hamiltonian for the $m$ cyclic coordinates and behaves like a negative Lagrangian $L_{noncyclic}$ for all the $s = n - m$ noncyclic coordinates $i = 1, 2, \dots, s$. Thus the equations of motion for the $s$ non-cyclic variables are given using Lagrange's equations of motion, while the Routhian behaves like a Hamiltonian $H_{cyclic}$ for the $m$ ignorable cyclic variables $i = s+1, \dots, n$.

Ignoring  both  the  Lagrange  multiplier  and  generalized  forces,  then  the  partitioned  equations  of  motion 
for  the  non-cyclic  and  cyclic  generalized  coordinates  are  given  in  Table  8.1. 


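Table 8.1: Equations of motion for the Routhian $R_{cyclic}$

| | Lagrange equations | Hamilton equations |
|---|---|---|
| Coordinates | Noncyclic: $1 \le i \le s$ | Cyclic: $s + 1 \le i \le n$ |
| Equations of motion | $\frac{\partial R_{cyclic}}{\partial q_i} = -\frac{\partial L_{noncyclic}}{\partial q_i}$ | $\frac{\partial R_{cyclic}}{\partial q_i} = -\dot{p}_i$ |
| | $\frac{\partial R_{cyclic}}{\partial \dot{q}_i} = -\frac{\partial L_{noncyclic}}{\partial \dot{q}_i}$ | $\frac{\partial R_{cyclic}}{\partial p_i} = \dot{q}_i$ |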

Thus there are $m$ cyclic (ignorable) coordinates $(q, p)_{s+1}, \dots, (q, p)_n$ which obey Hamilton's equations of motion, while the first $s = n - m$ non-cyclic (non-ignorable) coordinates $(q, \dot{q})_1, \dots, (q, \dot{q})_s$ for $i = 1, 2, \dots, s$ obey Lagrange equations. The solution for the cyclic variables is trivial since they are constants of motion, and thus the Routhian $R_{cyclic}$ has reduced the number of equations of motion that must be solved from $n$ to the $s = n - m$ non-cyclic variables. This Routhian provides an especially useful way to reduce the number of equations of motion for rotating systems.

Note that several definitions of the Routhian are in use; for example, some books define this Routhian as the negative of the definition used here so that it corresponds to a positive Lagrangian. However, this sign usually cancels when deriving the equations of motion, thus the sign convention is unimportant if a consistent sign convention is used.
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8.6.2  R noncyclic  ~ Routhian  is  a Hamiltonian  for  the  non-cyclic  variables 

The non-cyclic Routhian $R_{noncyclic}$ complements $R_{cyclic}$. Again the generalized coordinates with $1 \le i \le s$ are assumed to be non-cyclic, while those with $s + 1 \le i \le n$ are ignorable cyclic coordinates. However, the roles of $(q, p)$ and $(q, \dot{q})$ are interchanged, that is, the cyclic variables are expressed in terms of $(q, \dot{q})$ and the non-cyclic variables are expressed in terms of $(q, p)$, which is the opposite of what was used for $R_{cyclic}$.

$$R_{noncyclic}(q_1, \dots, q_n; p_1, \dots, p_s; \dot{q}_{s+1}, \dots, \dot{q}_n; t) = \sum_{noncyclic}^{s} p_i \dot{q}_i - L_{noncyclic} - L_{cyclic} \tag{8.66}$$

$$= H_{noncyclic} - L_{cyclic} \tag{8.67}$$

It can be written in a frequently used form

$$R_{noncyclic}(q_1, \dots, q_n; p_1, \dots, p_s; \dot{q}_{s+1}, \dots, \dot{q}_n; t) = \sum_{noncyclic}^{s} p_i \dot{q}_i - L = \sum_{i=1}^{n} p_i \dot{q}_i - L - \sum_{cyclic}^{m} p_i \dot{q}_i \tag{8.68}$$

$$= H - \sum_{cyclic}^{m} p_i \dot{q}_i$$

This Routhian behaves like a Hamiltonian for the $s$ non-cyclic variables, which are expressed in terms of $q$ and $p$ appropriate for a Hamiltonian. This Routhian writes the $m$ cyclic coordinates in terms of $q$ and $\dot{q}$, appropriate for a Lagrangian, which are treated assuming the Routhian $R_{noncyclic}$ is a negative Lagrangian for these cyclic variables, as summarized in table 8.2.

Table 8.2: Equations of motion for the Routhian $R_{noncyclic}$

| | Hamilton equations | Lagrange equations |
|---|---|---|
| Coordinates | Noncyclic: $1 \le i \le s$ | Cyclic: $s + 1 \le i \le n$ |
| Equations of motion | $\frac{\partial R_{noncyclic}}{\partial q_i} = -\dot{p}_i$ | $\frac{\partial R_{noncyclic}}{\partial q_i} = -\frac{\partial L_{cyclic}}{\partial q_i}$ |
| | $\frac{\partial R_{noncyclic}}{\partial p_i} = \dot{q}_i$ | $\frac{\partial R_{noncyclic}}{\partial \dot{q}_i} = -\frac{\partial L_{cyclic}}{\partial \dot{q}_i}$ |

This non-cyclic Routhian $R_{noncyclic}$ is especially useful since it equals the Hamiltonian for the non-cyclic variables, that is, the kinetic energy for motion of the cyclic variables has been removed. Note that since the cyclic variables are constants of motion, then $R_{noncyclic}$ is a constant of motion if $H$ is a constant of motion. However, $R_{noncyclic}$ does not equal the total energy since the coordinate transformation is time dependent, that is, $R_{noncyclic}$ corresponds to the energy of the non-cyclic parts of the motion. For example, when used to describe rotational motion, $R_{noncyclic}$ corresponds to the energy in the non-inertial rotating body-fixed frame of reference. This is especially useful in treating rotating systems such as rotating galaxies, rotating machinery, molecules, or rotating strongly-deformed nuclei as discussed in chapter 10.9.

The Lagrangian and Hamiltonian are the fundamental algebraic approaches to classical mechanics. The Routhian reduction method is a valuable hybrid technique that exploits a trick to reduce the number of variables that have to be solved for complicated problems encountered in science and engineering. The Routhian $R_{noncyclic}$ provides the most useful approach for solving the equations of motion for rotating molecules, deformed nuclei, or astrophysical objects in that it gives the Hamiltonian in the non-inertial body-fixed rotating frame of reference, ignoring the rotational energy of the frame. By contrast, the cyclic Routhian $R_{cyclic}$ is especially useful to exploit Lagrangian mechanics for solving problems in rigid-body rotation such as the Tippe Top described in example 11.14.

Note that the Lagrangian, the Hamiltonian, and both the $R_{cyclic}$ and $R_{noncyclic}$ Routhians all are scalars under rotation, that is, they are rotationally invariant. However, they may be expressed in terms of the coordinates in either the stationary or a rotating frame. The major difference is that the Routhian includes only subsets of the kinetic energy term $\sum_j p_j \dot{q}_j$. The relative merits of using the Lagrangian, the Hamiltonian, and both the $R_{cyclic}$ and $R_{noncyclic}$ Routhian reduction methods are illustrated by the following examples.
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8.6  Example:  Spherical  pendulum  using  Hamiltonian  mechanics 


The spherical pendulum provides a simple test case for comparison of the use of Lagrangian mechanics, Hamiltonian mechanics, and both approaches to Routhian reduction. The Lagrangian mechanics solution of the spherical pendulum is described in example 6.10. The solution using Hamiltonian mechanics is given in this example, followed by solutions using both of the Routhian reduction approaches.

Consider the equations of motion of a spherical pendulum of mass $m$ and length $b$. The generalized coordinates are $\theta, \phi$ since the length is fixed at $r = b$. The kinetic energy is

$$T = \frac{1}{2} m b^2 \dot{\theta}^2 + \frac{1}{2} m b^2 \sin^2\theta \, \dot{\phi}^2$$

The potential energy is $U = -mgb\cos\theta$, giving that

$$L(r, \theta, \phi, \dot{r}, \dot{\theta}, \dot{\phi}) = \frac{1}{2} m b^2 \dot{\theta}^2 + \frac{1}{2} m b^2 \sin^2\theta \, \dot{\phi}^2 + mgb\cos\theta$$

[Figure: Spherical pendulum]


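The generalized momenta are

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m b^2 \dot{\theta}$$

$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m b^2 \sin^2\theta \, \dot{\phi}$$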

Since  the  system  is  conservative,  and  the  transformation  from  rectangular  to  spherical  coordinates  does  not 
depend  explicitly  on  time,  then  the  Hamiltonian  is  conserved  and  equals  the  total  energy.  The  generalized 
momenta  allow  the  Hamiltonian  to  be  written  as 

$$H(r, \theta, \phi, p_r, p_\theta, p_\phi) = \frac{p_\theta^2}{2mb^2} + \frac{p_\phi^2}{2mb^2\sin^2\theta} - mgb\cos\theta$$


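The equations of motion are

$$\dot{p}_\theta = -\frac{\partial H}{\partial \theta} = \frac{p_\phi^2 \cos\theta}{mb^2 \sin^3\theta} - mgb\sin\theta \tag{a}$$

$$\dot{p}_\phi = -\frac{\partial H}{\partial \phi} = 0 \tag{b}$$

$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mb^2} \tag{c}$$

$$\dot{\phi} = \frac{\partial H}{\partial p_\phi} = \frac{p_\phi}{mb^2\sin^2\theta} \tag{d}$$

Taking the time derivative of equation (c) and using (a) to substitute for $\dot{p}_\theta$ gives

$$\ddot{\theta} - \frac{p_\phi^2 \cos\theta}{m^2 b^4 \sin^3\theta} + \frac{g}{b}\sin\theta = 0 \tag{e}$$

Note that equation (b) shows that $\phi$ is a cyclic coordinate. Thus

$$p_\phi = mb^2 \sin^2\theta \, \dot{\phi} = \text{constant}$$

that is, the angular momentum about the vertical axis is conserved. Note that although $p_\phi$ is a constant of motion, $\dot{\phi} = \frac{p_\phi}{mb^2\sin^2\theta}$ is a function of $\theta$, and thus in general it is not conserved. There are various solutions depending on the initial conditions. If $p_\phi = 0$ then the pendulum is just the simple pendulum discussed previously that can oscillate, or rotate, in the $\theta$ direction. The opposite extreme is where $p_\theta = 0$, where the pendulum rotates in the $\phi$ direction with constant $\theta$. In general the motion is a complicated coupling of the $\theta$ and $\phi$ motions.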
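Hamilton's equations for the spherical pendulum can be integrated numerically to see the two conservation laws at work. The following is a minimal sketch, assuming a fixed-step fourth-order Runge-Kutta integrator; the function names, the step size, and the parameter values ($m = b = 1$, $g = 9.81$) are illustrative assumptions, not part of the example itself.

```python
import math


def hamiltonian(state, m, b, g):
    """H = p_theta^2/(2 m b^2) + p_phi^2/(2 m b^2 sin^2(theta)) - m g b cos(theta)."""
    theta, _phi, p_theta, p_phi = state
    return (
        p_theta ** 2 / (2 * m * b ** 2)
        + p_phi ** 2 / (2 * m * b ** 2 * math.sin(theta) ** 2)
        - m * g * b * math.cos(theta)
    )


def derivatives(state, m, b, g):
    """Hamilton's equations: returns (theta_dot, phi_dot, p_theta_dot, p_phi_dot)."""
    theta, _phi, p_theta, p_phi = state
    s, c = math.sin(theta), math.cos(theta)
    theta_dot = p_theta / (m * b ** 2)
    phi_dot = p_phi / (m * b ** 2 * s ** 2)
    p_theta_dot = p_phi ** 2 * c / (m * b ** 2 * s ** 3) - m * g * b * s
    p_phi_dot = 0.0  # phi is cyclic, so p_phi is conserved
    return (theta_dot, phi_dot, p_theta_dot, p_phi_dot)


def rk4_step(state, dt, m, b, g):
    """Advance the state by one classical Runge-Kutta step."""
    def shifted(k, scale):
        return tuple(x + scale * dx for x, dx in zip(state, k))

    k1 = derivatives(state, m, b, g)
    k2 = derivatives(shifted(k1, dt / 2), m, b, g)
    k3 = derivatives(shifted(k2, dt / 2), m, b, g)
    k4 = derivatives(shifted(k3, dt), m, b, g)
    return tuple(
        x + dt / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
        for x, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)
    )


def integrate(state, dt, steps, m=1.0, b=1.0, g=9.81):
    """Integrate for `steps` steps and return the final state."""
    for _ in range(steps):
        state = rk4_step(state, dt, m, b, g)
    return state


start = (1.0, 0.0, 0.0, 1.0)  # (theta, phi, p_theta, p_phi), hypothetical values
end = integrate(start, dt=1e-3, steps=2000)
print("energy drift:", hamiltonian(end, 1.0, 1.0, 9.81) - hamiltonian(start, 1.0, 1.0, 9.81))
print("p_phi:", start[3], "->", end[3])
```

Within integration error the Hamiltonian stays constant, and $p_\phi$ is unchanged because its time derivative is identically zero.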
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8.7  Example:  Spherical  pendulum  using  Rcycilc(r.  0,  <i>,  f,  O.  pu) 
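The Lagrangian for the spherical pendulum is

$$L(r, \theta, \phi, \dot{r}, \dot{\theta}, \dot{\phi}) = \frac{1}{2} m b^2 \dot{\theta}^2 + \frac{1}{2} m b^2 \sin^2\theta \, \dot{\phi}^2 + mgb\cos\theta$$

Note that the Lagrangian is independent of $\phi$, therefore $\phi$ is an ignorable variable with

$$\frac{\partial L}{\partial \phi} = \frac{\partial H}{\partial \phi} = 0$$

Therefore $p_\phi$ is a constant of motion equal to

$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m b^2 \sin^2\theta \, \dot{\phi}$$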

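The Routhian $R_{cyclic}(r, \theta, \phi, \dot{r}, \dot{\theta}, p_\phi)$ equals

$$R_{cyclic}(r, \theta, \phi, \dot{r}, \dot{\theta}, p_\phi) = p_\phi \dot{\phi} - L$$

$$= -\left[ \frac{1}{2} m b^2 \dot{\theta}^2 + \frac{1}{2} m b^2 \sin^2\theta \, \dot{\phi}^2 + mgb\cos\theta - m b^2 \sin^2\theta \, \dot{\phi}^2 \right]$$

$$= -\frac{1}{2} m b^2 \dot{\theta}^2 + \frac{1}{2} \frac{p_\phi^2}{m b^2 \sin^2\theta} - mgb\cos\theta$$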


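The Routhian $R_{cyclic}(r, \theta, \phi, \dot{r}, \dot{\theta}, p_\phi)$ behaves like a Hamiltonian for $\phi$, and like a Lagrangian $L' = -R_{cyclic}$ for $\theta$. Use of Hamilton's canonical equations for $\phi$ gives

$$\dot{\phi} = \frac{\partial R_{cyclic}}{\partial p_\phi} = \frac{p_\phi}{m b^2 \sin^2\theta}$$

$$-\dot{p}_\phi = \frac{\partial R_{cyclic}}{\partial \phi} = 0$$

These two equations show that $p_\phi$ is a constant of motion given by

$$m b^2 \sin^2\theta \, \dot{\phi} = p_\phi = \text{constant} \tag{$\alpha$}$$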


Note that the Hamiltonian only includes the kinetic energy for the $\phi$ motion, which is a constant of motion, but this energy does not equal the total energy. This is what is predicted by Noether's theorem due to the symmetry of the Lagrangian about the vertical $\phi$ axis.

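Since $R_{cyclic}(r, \theta, \phi, \dot{r}, \dot{\theta}, p_\phi)$ behaves like a Lagrangian for $\theta$, the Lagrange equation for $\theta$ is

$$\Lambda_\theta L' = \frac{d}{dt} \frac{\partial R_{cyclic}}{\partial \dot{\theta}} - \frac{\partial R_{cyclic}}{\partial \theta} = 0$$

where the negative sign of the Lagrangian in $R_{cyclic}(r, \theta, \phi, \dot{r}, \dot{\theta}, p_\phi)$ cancels. This leads to

$$m b^2 \ddot{\theta} = \frac{p_\phi^2 \cos\theta}{m b^2 \sin^3\theta} - mgb\sin\theta$$

that is

$$\ddot{\theta} - \frac{p_\phi^2 \cos\theta}{m^2 b^4 \sin^3\theta} + \frac{g}{b}\sin\theta = 0 \tag{$\beta$}$$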


This result is identical to the one obtained using Lagrangian mechanics in example 6.12 and Hamiltonian mechanics given in example 8.6. The Routhian $R_{cyclic}$ simplified the problem to one degree of freedom $\theta$ by absorbing into the Hamiltonian the cyclic, that is, ignorable, $\phi$ coordinate and its conserved conjugate momentum $p_\phi$. Note that the central term in equation $\beta$ is the centrifugal term which is due to rotation about the vertical axis. This term is zero for plane pendulum motion when $p_\phi = 0$.
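The reduced one-dimensional equation for $\theta$ makes special cases easy to explore. As an illustration, the sketch below uses the $\theta$ equation of motion derived above to find the value of $p_\phi$ for which $\theta$ stays fixed (the conical pendulum), namely $p_\phi^2 = m^2 g b^3 \sin^4\theta_0 / \cos\theta_0$. The function names and sample numbers are hypothetical, and the closed form assumes $0 < \theta_0 < \pi/2$.

```python
import math


def theta_acceleration(theta, p_phi, m, b, g):
    """theta'' = p_phi^2 cos(theta) / (m^2 b^4 sin^3(theta)) - (g/b) sin(theta)."""
    return (
        p_phi ** 2 * math.cos(theta) / (m ** 2 * b ** 4 * math.sin(theta) ** 3)
        - (g / b) * math.sin(theta)
    )


def conical_p_phi(theta0, m, b, g):
    """p_phi that balances the centrifugal and gravitational terms at theta0."""
    return m * math.sqrt(g * b ** 3 * math.sin(theta0) ** 4 / math.cos(theta0))


def phi_rate(theta, p_phi, m, b):
    """phi_dot = p_phi / (m b^2 sin^2(theta))."""
    return p_phi / (m * b ** 2 * math.sin(theta) ** 2)


m, b, g, theta0 = 1.0, 1.0, 9.81, 0.6  # hypothetical sample values
p_phi = conical_p_phi(theta0, m, b, g)
print("theta'' at theta0:", theta_acceleration(theta0, p_phi, m, b, g))
print("phi_dot:", phi_rate(theta0, p_phi, m, b))
print("sqrt(g / (b cos theta0)):", math.sqrt(g / (b * math.cos(theta0))))
```

The last two printed values agree, which is the familiar conical-pendulum rotation rate; setting `p_phi = 0` instead leaves only the plane-pendulum term $-(g/b)\sin\theta$.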
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8.8  Example:  Spherical  pendulum  using  RnoncycUcir,  0,  0,  [>,■■  [>()■  4>) 


For a rotational system the Routhian $R_{noncyclic}(r, \theta, \phi, p_r, p_\theta, \dot{\phi})$ also can be used to project out the Hamiltonian for the active variables in the rotating body-fixed frame of reference. Consider the spherical pendulum where the rotating frame is rotating with angular velocity $\dot{\phi}$. The Lagrangian for the spherical pendulum is

$$L(r, \theta, \phi, \dot{r}, \dot{\theta}, \dot{\phi}) = \frac{1}{2} m b^2 \dot{\theta}^2 + \frac{1}{2} m b^2 \sin^2\theta \, \dot{\phi}^2 + mgb\cos\theta$$

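Note that the Lagrangian is independent of $\phi$, therefore $\phi$ is an ignorable variable with

$$\frac{\partial L}{\partial \phi} = \frac{\partial H}{\partial \phi} = 0$$

Therefore $p_\phi$ is a constant of motion equal to

$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m b^2 \sin^2\theta \, \dot{\phi}$$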

The total Hamiltonian is given by

$$H(r, \theta, \phi, p_r, p_\theta, p_\phi) = \sum_i p_i \dot{q}_i - L = \frac{p_\theta^2}{2mb^2} + \frac{p_\phi^2}{2mb^2\sin^2\theta} - mgb\cos\theta$$

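The Routhian for the rotating frame of reference $H_{rot}$ is given by equation 8.68, that is

$$R_{noncyclic}(r, \theta, \phi, p_r, p_\theta, \dot{\phi}) = \sum_{i=1}^{n} p_i \dot{q}_i - p_\phi \dot{\phi} - L = H - p_\phi \dot{\phi}$$

$$= \frac{p_\theta^2}{2mb^2} + \frac{p_\phi^2}{2mb^2\sin^2\theta} - mgb\cos\theta - p_\phi \dot{\phi}$$

$$= \frac{p_\theta^2}{2mb^2} - \frac{1}{2} m b^2 \sin^2\theta \, \dot{\phi}^2 - mgb\cos\theta \tag{$\gamma$}$$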


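This behaves like a negative Lagrangian for $\phi$ and a Hamiltonian for $\theta$. The conjugate momenta are

$$p_\phi = -\frac{\partial R_{noncyclic}}{\partial \dot{\phi}} = \frac{\partial L}{\partial \dot{\phi}} = m b^2 \sin^2\theta \, \dot{\phi}$$

$$\dot{p}_\phi = -\frac{\partial R_{noncyclic}}{\partial \phi} = \frac{\partial L}{\partial \phi} = 0$$

that is, $p_\phi$ is a constant of motion.
Hamilton's equations of motion give

$$\dot{\theta} = \frac{\partial R_{noncyclic}}{\partial p_\theta} = \frac{p_\theta}{mb^2} \tag{$\delta$}$$

$$-\dot{p}_\theta = \frac{\partial R_{noncyclic}}{\partial \theta} = -\frac{p_\phi^2 \cos\theta}{mb^2 \sin^3\theta} + mgb\sin\theta \tag{$\epsilon$}$$

Equation $\delta$ gives that

$$\frac{d}{dt} p_\theta = mb^2 \ddot{\theta}$$

Inserting this into equation $\epsilon$ gives

$$\ddot{\theta} - \frac{p_\phi^2 \cos\theta}{m^2 b^4 \sin^3\theta} + \frac{g}{b}\sin\theta = 0$$

which is identical to the equation of motion $\beta$ derived using $R_{cyclic}$. The Hamiltonian in the rotating frame is a constant of motion given by $\gamma$, but it does not include the total energy.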

Note  that  these  examples  show  that  both  forms  of  the  Routhian,  as  well  as  the  complete  Lagrangian 
formalism,  shown  in  example  6.12,  and  complete  Hamiltonian  formalism,  shown  in  example  8.6,  all  give  the 
same  equations  of  motion.  This  illustrates  that  the  Lagrangian,  Hamiltonian,  and  Routhian  mechanics  all 
give  the  same  equations  of  motion  and  this  applies  both  in  the  static  inertial  frame  as  well  as  a rotating  frame 
since  the  Lagrangian,  Hamiltonian  and  Routhian  all  are  scalars  under  rotation,  that  is,  they  are  rotationally 
invariant. 
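One point that is easy to get wrong with $R_{noncyclic}$ is that $\dot{\phi}$, not $p_\phi$, is held fixed when differentiating with respect to $\theta$. The following sketch checks this numerically for the spherical pendulum, comparing a central finite difference of $R_{noncyclic} = \frac{p_\theta^2}{2mb^2} - \frac{1}{2}mb^2\sin^2\theta\,\dot{\phi}^2 - mgb\cos\theta$ with the closed-form $\dot{p}_\theta$ obtained from the Hamiltonian. The function names, the step `h`, and the sample values are assumptions made for illustration.

```python
import math


def routhian_noncyclic(theta, p_theta, phi_dot, m, b, g):
    """R_noncyclic written in terms of (theta, p_theta) and the cyclic velocity phi_dot."""
    return (
        p_theta ** 2 / (2 * m * b ** 2)
        - 0.5 * m * b ** 2 * math.sin(theta) ** 2 * phi_dot ** 2
        - m * g * b * math.cos(theta)
    )


def p_theta_dot_from_routhian(theta, p_theta, phi_dot, m, b, g, h=1e-6):
    """p_theta_dot = -dR/dtheta, by central difference at fixed phi_dot."""
    d_r = (
        routhian_noncyclic(theta + h, p_theta, phi_dot, m, b, g)
        - routhian_noncyclic(theta - h, p_theta, phi_dot, m, b, g)
    ) / (2 * h)
    return -d_r


def p_theta_dot_closed_form(theta, p_phi, m, b, g):
    """p_theta_dot from Hamilton's equation, written in terms of p_phi."""
    return (
        p_phi ** 2 * math.cos(theta) / (m * b ** 2 * math.sin(theta) ** 3)
        - m * g * b * math.sin(theta)
    )


m, b, g = 1.5, 0.7, 9.81           # hypothetical sample values
theta, p_theta, phi_dot = 0.8, 0.3, 2.0
p_phi = m * b ** 2 * math.sin(theta) ** 2 * phi_dot
print(p_theta_dot_from_routhian(theta, p_theta, phi_dot, m, b, g))
print(p_theta_dot_closed_form(theta, p_phi, m, b, g))
```

The two printed values agree to within the finite-difference error, consistent with the statement that both Routhians give the same equation of motion for $\theta$.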
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8.9  Example:  Single  particle  moving  in  a vertical  plane  under  the  influence  of 
an  inverse-square  central  force 

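The Lagrangian for a single particle of mass $m$, moving in a vertical plane and subject to an inverse-square central force, is specified by two generalized coordinates, $r$ and $\theta$.

$$L = \frac{m}{2} \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right) + \frac{k}{r}$$

The ignorable coordinate is $\theta$, since it is cyclic. Let the constant conjugate momentum be denoted by $p_\theta = \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta}$. Then the corresponding cyclic Routhian is

$$R_{cyclic}(r, \theta, \dot{r}, p_\theta) = p_\theta \dot{\theta} - L = \frac{p_\theta^2}{2mr^2} - \frac{1}{2} m \dot{r}^2 - \frac{k}{r}$$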

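This Routhian is the equivalent one-dimensional potential $U(r)$ minus the kinetic energy of radial motion. Applying Hamilton's equations to the cyclic coordinate $\theta$ gives

$$\dot{p}_\theta = -\frac{\partial R_{cyclic}}{\partial \theta} = 0 \qquad \dot{\theta} = \frac{\partial R_{cyclic}}{\partial p_\theta} = \frac{p_\theta}{mr^2}$$

implying a solution

$$p_\theta = mr^2\dot{\theta} = l$$

where the angular momentum $l$ is a constant.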

The Lagrange-Euler equation can be applied to the non-cyclic coordinate $r$

$$\Lambda_r L' = \frac{d}{dt} \frac{\partial R_{cyclic}}{\partial \dot{r}} - \frac{\partial R_{cyclic}}{\partial r} = 0$$

where the negative sign of $R_{cyclic}$ cancels. This leads to the radial solution

$$m\ddot{r} - \frac{p_\theta^2}{mr^3} + \frac{k}{r^2} = 0$$

where $p_\theta = l$, which is a constant of motion in the centrifugal term. Thus the problem has been reduced to a one-dimensional problem in radius $r$ that is in a rotating frame of reference.
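As a worked illustration of the equivalent one-dimensional potential mentioned above, the steps below assume the attractive sign convention $U = -k/r$ with $k > 0$ (an assumption made for this sketch) and write $l$ for the constant $p_\theta$. The symbols $U_{\mathrm{eff}}$ and $r_0$ are introduced here only for illustration.

```latex
\begin{align*}
U_{\mathrm{eff}}(r) &= \frac{l^2}{2mr^2} - \frac{k}{r}, \\
m\ddot{r} &= -\frac{dU_{\mathrm{eff}}}{dr} = \frac{l^2}{mr^3} - \frac{k}{r^2}, \\
\left.\frac{dU_{\mathrm{eff}}}{dr}\right|_{r_0} = 0 \;&\Rightarrow\; r_0 = \frac{l^2}{mk}.
\end{align*}
```

Under these assumptions $r_0$ is the radius of the circular orbit, where the centrifugal and attractive terms balance and $\ddot{r} = 0$.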


8.7  Dissipative  dynamical  systems 

Dissipative  drag  forces  are  non-conservative  and  usually  are  velocity  dependent.  Chapter  4 showed  that 
the  motion  of  non-linear  dissipative  dynamical  systems  can  be  highly  sensitive  to  the  initial  conditions  and 
can  lead  to  chaotic  motion.  In  spite  of  the  complications  that  can  be  introduced  by  energy  dissipation, 
it  is  possible  to  use  variational  methods  to  incorporate  energy  dissipation  in  dynamical  systems  via  the 
following  three  different  approaches.  (1)  Explicitly  introduce  the  dissipative  force  as  a generalized  force 
in  the  Lagrangian  or  Hamiltonian  mechanics.  (2)  Use  Rayleigh’s  dissipation  function  when  the  dissipation 
forces  depend  linearly  on  velocity.  (3)  Use  non-standard  Lagrangians,  or  the  corresponding  Hamiltonians, 
that  incorporate  dissipation  directly  in  the  Lagrangian  or  Hamiltonian  as  discussed  in  chapter  13. 

8.7.1  Generalized  drag  force 

The  most  straightforward  approach  for  handling  dissipative  forces  is  to  include  the  dissipative  drag  force 
explicitly  as  a generalized  drag  force  in  the  Euler-Lagrange  equations.  The  drag  force  can  have  any  functional 
dependence  on  velocity,  position,  or  time. 


$$\mathbf{F}^{drag} = -f(\dot{\mathbf{q}}, \mathbf{q}, t)\, \hat{\mathbf{v}} \tag{8.69}$$

Note that since the drag force is dissipative, the dominant component of the drag force must point in the opposite direction to the velocity vector. For example, for a simple linear velocity dependence the generalized drag force could be of the form $Q_j^{EXC} = -\beta v_j$.
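To make the idea of a generalized drag force concrete, the sketch below projects a linear drag force $\mathbf{F} = -\beta\mathbf{v}$ onto plane polar coordinates using $Q_j = \mathbf{F} \cdot \partial\mathbf{r}/\partial q_j$, which should give $Q_r = -\beta\dot{r}$ and $Q_\theta = -\beta r^2\dot{\theta}$. The choice of polar coordinates, the function name, and the sample numbers are assumptions made for illustration.

```python
import math


def generalized_drag_polar(r, theta, r_dot, theta_dot, beta):
    """Return (Q_r, Q_theta) for the drag force F = -beta * v in plane polar coordinates."""
    # Cartesian velocity of the particle at x = r cos(theta), y = r sin(theta).
    vx = r_dot * math.cos(theta) - r * theta_dot * math.sin(theta)
    vy = r_dot * math.sin(theta) + r * theta_dot * math.cos(theta)
    fx, fy = -beta * vx, -beta * vy

    # Tangent vectors d(x, y)/dr and d(x, y)/dtheta.
    dr_dr = (math.cos(theta), math.sin(theta))
    dr_dtheta = (-r * math.sin(theta), r * math.cos(theta))

    q_r = fx * dr_dr[0] + fy * dr_dr[1]
    q_theta = fx * dr_dtheta[0] + fy * dr_dtheta[1]
    return q_r, q_theta


q_r, q_theta = generalized_drag_polar(r=2.0, theta=0.7, r_dot=0.5, theta_dot=1.2, beta=0.3)
print(q_r, -0.3 * 0.5)                 # compare with -beta * r_dot
print(q_theta, -0.3 * 2.0 ** 2 * 1.2)  # compare with -beta * r^2 * theta_dot
```

Note that $Q_\theta$ has the dimensions of a torque, as expected for a generalized force conjugate to an angle.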
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8.7.2  Rayleigh’s  dissipation  function 

Dissipative forces for fluids and gases depend linearly on velocity at Reynolds numbers $Re < 1$, that is, for low velocities. Such linear-velocity dissipative forces occur frequently in nature. The wide range of electrical conductors that obey Ohm's Law is an example of a dissipative force that depends linearly on velocity. Systems involving small amplitude oscillations at low velocities are other examples where the dissipation depends linearly on velocity. Such linear dissipative systems have a dissipative force of the form $\mathbf{F}^f = -b(x, y, z)\mathbf{v}$, where the dissipation coefficient $b(x, y, z)$ is velocity independent and may have different values along different axes. Dissipative forces that depend linearly on velocity can be absorbed directly into the Lagrange equations by expressing the vector frictional force $\mathbf{F}^f$ in terms of a scalar function of the generalized coordinates called the Rayleigh dissipation function $\mathcal{F}$, as proposed by Lord Rayleigh. The Rayleigh dissipation function is a useful way to include linear dissipative forces in both Lagrangian and Hamiltonian mechanics, as shown below.


Lagrangian  mechanics 

Consider $n$ equations of motion for the $n$ degrees of freedom, and assume that the dissipation depends linearly on velocity. Then, allowing all possible cross coupling of the equations of motion for $q_j$, the equations of motion can be written in the form

$$\sum_{j=1}^{n} \left[ m_{ij} \ddot{q}_j + b_{ij} \dot{q}_j + c_{ij} q_j \right] - Q_i(t) = 0 \tag{8.70}$$

Multiplying equation 8.70 by $\dot{q}_i$, taking the time integral, and summing over $i$ gives the following energy equation

$$\sum_{i=1}^{n} \sum_{j=1}^{n} \int_0^t m_{ij} \ddot{q}_j \dot{q}_i \, dt + \sum_{i=1}^{n} \sum_{j=1}^{n} \int_0^t b_{ij} \dot{q}_j \dot{q}_i \, dt + \sum_{i=1}^{n} \sum_{j=1}^{n} \int_0^t c_{ij} q_j \dot{q}_i \, dt = \sum_{i=1}^{n} \int_0^t Q_i(t) \dot{q}_i \, dt \tag{8.71}$$

The right-hand term is the total energy supplied to the system by the external generalized forces $Q_i(t)$ during the time $t$. The first time-integral term on the left-hand side is the total kinetic energy, while the third integral term equals the potential energy. The second integral term on the left is the time integral of $2\mathcal{F}$, where the Rayleigh dissipation function $\mathcal{F}$ is defined as

$$\mathcal{F} \equiv \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} \dot{q}_i \dot{q}_j \tag{8.72}$$

and the summations are over all $n$ particles of the system. This definition allows for complicated cross-coupling effects between the $n$ particles. Fortunately the particle-particle coupling effects usually can be neglected, allowing use of the simpler definition that includes only the diagonal terms. Then the diagonal form of the Rayleigh dissipation function can be written as

$$\mathcal{F} = \frac{1}{2} \sum_{i=1}^{n} b_i \dot{q}_i^2 \tag{8.73}$$

The frictional force in the $q_i$ direction is given by

$$F_i^f = -\frac{\partial \mathcal{F}}{\partial \dot{q}_i} = -b_i \dot{q}_i \tag{8.74}$$

which depends linearly on velocity $\dot{q}_i$. In general, the dissipative force is the velocity gradient of the Rayleigh dissipation function,

$$\mathbf{F}^f = -\nabla_{\dot{q}} \mathcal{F} \tag{8.75}$$

Note that the physical significance of the Rayleigh dissipation function is illustrated by calculating the work done by one particle $i$ against friction, which is

$$dW_i^f = -\mathbf{F}_i^f \cdot d\mathbf{r} = -\mathbf{F}_i^f \cdot \dot{\mathbf{q}}_i \, dt = b_i \dot{q}_i^2 \, dt \tag{8.76}$$

Therefore

$$2\mathcal{F} = \frac{dW^f}{dt} \tag{8.77}$$
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which  is  the  rate  of  energy  (power)  loss  due  to  the  dissipative  forces  involved.  The  same  relation  is  obtained 
after  summing  over  all  the  particles  involved. 

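Transforming the frictional force into generalized coordinates requires the relation

$$\dot{\mathbf{r}}_i = \sum_k \frac{\partial \mathbf{r}_i}{\partial q_k} \dot{q}_k + \frac{\partial \mathbf{r}_i}{\partial t} \tag{8.78}$$

Note that the derivative with respect to $\dot{q}_j$ equals

$$\frac{\partial \dot{\mathbf{r}}_i}{\partial \dot{q}_j} = \frac{\partial \mathbf{r}_i}{\partial q_j} \tag{8.79}$$

Using equations 6.17 and 6.47, the $j$ component of the generalized frictional force $Q_j^f$ is given by

$$Q_j^f = \sum_{i=1}^{n} \mathbf{F}_i^f \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} = \sum_{i=1}^{n} \mathbf{F}_i^f \cdot \frac{\partial \dot{\mathbf{r}}_i}{\partial \dot{q}_j} = -\sum_{i=1}^{n} \nabla_{v_i} \mathcal{F} \cdot \frac{\partial \dot{\mathbf{r}}_i}{\partial \dot{q}_j} = -\frac{\partial \mathcal{F}}{\partial \dot{q}_j} \tag{8.80}$$

Thus the Lagrange equations 6.47 can be written including the Rayleigh dissipation function in the form

$$\left\{ \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} + \frac{\partial \mathcal{F}}{\partial \dot{q}_j} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC} \tag{8.81}$$

where $Q_j^{EXC}$ corresponds to the generalized forces remaining after removal of the generalized linear, velocity-dependent, frictional force $Q_j^f$, and the holonomic forces of constraint are absorbed into the Lagrange multiplier term.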

Linear  dissipative  forces  can  be  directly,  and  elegantly,  included  in  Lagrangian  mechanics  by  use  of 
Rayleigh’s  dissipation  function.  Equation  8.81  facilitates  solving  the  equations  of  motion  when  linear  velocity- 
dependent  dissipative  forces  are  acting  on  the  system. 
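The interpretation of $2\mathcal{F}$ as the rate of energy loss can be checked numerically for a single damped oscillator, $m\ddot{x} + b\dot{x} + cx = 0$, for which $\mathcal{F} = \frac{1}{2}b\dot{x}^2$. The following is a minimal sketch, assuming a fixed-step Runge-Kutta integrator and trapezoidal integration of the dissipated power; all names and parameter values are hypothetical.

```python
def energy_balance(m=1.0, b=0.2, c=4.0, x0=1.0, v0=0.0, dt=1e-3, steps=5000):
    """Return (mechanical energy lost, time integral of 2F) for a damped oscillator."""

    def accel(x, v):
        # Lagrange equation with the Rayleigh term: m x'' + dF/dv + c x = 0, with dF/dv = b v.
        return -(b * v + c * x) / m

    def energy(x, v):
        return 0.5 * m * v * v + 0.5 * c * x * x

    x, v = x0, v0
    dissipated = 0.0
    for _ in range(steps):
        power_start = b * v * v  # 2F at the start of the step

        # One classical Runge-Kutta step for (x, v).
        k1x, k1v = v, accel(x, v)
        k2x, k2v = v + 0.5 * dt * k1v, accel(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = v + 0.5 * dt * k2v, accel(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = v + dt * k3v, accel(x + dt * k3x, v + dt * k3v)
        x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)

        # Trapezoidal rule for the integral of 2F = b v^2.
        dissipated += 0.5 * dt * (power_start + b * v * v)

    return energy(x0, v0) - energy(x, v), dissipated


lost, dissipated = energy_balance()
print("mechanical energy lost:", lost)
print("integral of 2F dt:     ", dissipated)
```

The two numbers agree to within the integration error, which is the content of the relation $2\mathcal{F} = dW^f/dt$ for this one-dimensional case.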


Hamiltonian  mechanics 

If  the  nonconservative  forces  depend  linearly  on  velocity,  and  are  derivable  from  Rayleigh’s  dissipation 
function  according  to  equation  8.81,  then  using  the  definition  of  generalized  momentum  gives 


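$$\dot{p}_i = \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_i} = \frac{\partial L}{\partial q_i} + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_i}(\mathbf{q}, t) + Q_i^{EXC} \right] - \frac{\partial \mathcal{F}}{\partial \dot{q}_i} \tag{8.82}$$

$$\dot{p}_i = -\frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial q_i} + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_i}(\mathbf{q}, t) + Q_i^{EXC} \right] - \frac{\partial \mathcal{F}}{\partial \dot{q}_i} \tag{8.83}$$

Thus Hamilton's equations become

$$\dot{q}_i = \frac{\partial H}{\partial p_i} \tag{8.84}$$

$$\dot{p}_i = -\frac{\partial H}{\partial q_i} + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_i}(\mathbf{q}, t) + Q_i^{EXC} \right] - \frac{\partial \mathcal{F}}{\partial \dot{q}_i} \tag{8.85}$$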

The  Rayleigh  dissipation  function  provides  an  elegant  and  convenient  way  to  account  for  the  frequently 
encountered  special  case  of  linear  dissipative  forces  in  Lagrangian  and  Hamiltonian  mechanics.  The  following 
two  examples  illustrate  the  usefulness  of  the  Rayleigh  dissipation  function  when  applied  to  both  classical 
mechanics  and  electromagnetism. 
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8.10  Example:  Driven,  linearly- damped,  coupled  linear  oscillators 


Consider the two identical, linearly damped, coupled oscillators (damping constant $\beta$) shown in the figure. A periodic force $F = F_0\cos(\omega t)$ is applied to the left-hand mass $m$. The kinetic energy of the system is

$$T = \frac{1}{2} m \left( \dot{x}_1^2 + \dot{x}_2^2 \right)$$

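The potential energy is

$$U = \frac{1}{2}\kappa x_1^2 + \frac{1}{2}\kappa x_2^2 + \frac{1}{2}\kappa' (x_2 - x_1)^2 = \frac{1}{2}(\kappa + \kappa') x_1^2 + \frac{1}{2}(\kappa + \kappa') x_2^2 - \kappa' x_1 x_2$$

[Figure: Harmonically-driven, linearly-damped, coupled linear oscillators.]

Thus the Lagrangian equals

$$L = \frac{1}{2} m \left( \dot{x}_1^2 + \dot{x}_2^2 \right) - \left[ \frac{1}{2}(\kappa + \kappa') x_1^2 + \frac{1}{2}(\kappa + \kappa') x_2^2 - \kappa' x_1 x_2 \right]$$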

Since the damping is linear, it is possible to use the Rayleigh dissipation function

$$\mathcal{F} = \frac{1}{2} \beta \left( \dot{x}_1^2 + \dot{x}_2^2 \right)$$

The applied generalized forces are

$$Q_1' = F_0 \cos(\omega t) \qquad Q_2' = 0$$

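Use of the Euler-Lagrange equations 8.81 to derive the equations of motion

$$\left\{ \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} + \frac{\partial \mathcal{F}}{\partial \dot{q}_j} = Q_j' + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$$

gives

$$m\ddot{x}_1 + \beta\dot{x}_1 + (\kappa + \kappa') x_1 - \kappa' x_2 = F_0 \cos(\omega t)$$

$$m\ddot{x}_2 + \beta\dot{x}_2 + (\kappa + \kappa') x_2 - \kappa' x_1 = 0$$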

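These two coupled equations can be decoupled and simplified by making a transformation to normal coordinates $\eta_1, \eta_2$ where

$$\eta_1 = x_1 - x_2 \qquad \eta_2 = x_1 + x_2$$

Thus

$$x_1 = \frac{1}{2}(\eta_1 + \eta_2) \qquad x_2 = \frac{1}{2}(\eta_2 - \eta_1)$$

Inserting these into the equations of motion gives

$$m(\ddot{\eta}_1 + \ddot{\eta}_2) + \beta(\dot{\eta}_1 + \dot{\eta}_2) + (\kappa + \kappa')(\eta_1 + \eta_2) - \kappa'(\eta_2 - \eta_1) = 2F_0\cos(\omega t)$$

$$m(\ddot{\eta}_2 - \ddot{\eta}_1) + \beta(\dot{\eta}_2 - \dot{\eta}_1) + (\kappa + \kappa')(\eta_2 - \eta_1) - \kappa'(\eta_1 + \eta_2) = 0$$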

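Adding and subtracting these two equations gives the following two decoupled equations

$$\ddot{\eta}_1 + \frac{\beta}{m}\dot{\eta}_1 + \frac{(\kappa + 2\kappa')}{m}\eta_1 = \frac{F_0}{m}\cos(\omega t)$$

$$\ddot{\eta}_2 + \frac{\beta}{m}\dot{\eta}_2 + \frac{\kappa}{m}\eta_2 = \frac{F_0}{m}\cos(\omega t)$$

Define $\Gamma = \frac{\beta}{m}$, $\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}}$, $\omega_2 = \sqrt{\frac{\kappa}{m}}$, $A = \frac{F_0}{m}$. Then the two independent equations of motion become

$$\ddot{\eta}_1 + \Gamma \dot{\eta}_1 + \omega_1^2 \eta_1 = A \cos(\omega t)$$

$$\ddot{\eta}_2 + \Gamma \dot{\eta}_2 + \omega_2^2 \eta_2 = A \cos(\omega t)$$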
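
The decoupling can be checked numerically by computing the steady-state response in two ways: directly from the coupled equations for $x_1, x_2$, and from the two normal-mode equations. The sketch below assumes complex amplitudes of the form $x_j = \mathrm{Re}[X_j e^{i\omega t}]$; the parameter values and variable names are illustrative assumptions only.

```python
# Sketch: steady-state response of the driven, damped, coupled oscillators
# computed (1) from the coupled equations and (2) from the normal modes.
# All numerical values are illustrative; kappa_c stands for kappa'.
import cmath

m, kappa, kappa_c, beta, f0, omega = 1.0, 4.0, 2.5, 0.2, 1.0, 1.7

# (1) Coupled equations with x_j = Re[X_j exp(i*omega*t)]:
#       d*X1 - kappa_c*X2 = f0
#      -kappa_c*X1 + d*X2 = 0
d = -m * omega**2 + 1j * beta * omega + (kappa + kappa_c)
det = d * d - kappa_c**2
x1_direct = f0 * d / det
x2_direct = f0 * kappa_c / det

# (2) Normal modes: eta'' + Gamma*eta' + w^2*eta = A*cos(omega*t)
gamma = beta / m
a = f0 / m
w1_sq = (kappa + 2.0 * kappa_c) / m
w2_sq = kappa / m
eta1 = a / (w1_sq - omega**2 + 1j * gamma * omega)
eta2 = a / (w2_sq - omega**2 + 1j * gamma * omega)
x1_modes = 0.5 * (eta1 + eta2)
x2_modes = 0.5 * (eta2 - eta1)

print(abs(x1_direct), cmath.phase(x1_direct))  # amplitude and phase of x1
print(abs(x1_direct - x1_modes), abs(x2_direct - x2_modes))
```

The two routes agree to rounding error, which is the numerical counterpart of the statement that the normal coordinates evolve independently.
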
This solution is a superposition of two independent, linearly-damped, driven normal modes $\eta_1$ and $\eta_2$ that have different natural frequencies $\omega_1$ and $\omega_2$. For weak damping these two driven normal modes each undergo damped oscillatory motion with the $\eta_1$ and $\eta_2$ normal modes exhibiting resonances at $\omega'_1 = \sqrt{\omega_1^2 - 2\left(\frac{\Gamma}{2}\right)^2}$ and $\omega'_2 = \sqrt{\omega_2^2 - 2\left(\frac{\Gamma}{2}\right)^2}$.
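
The quoted resonance frequencies can be compared with a brute-force scan of the steady-state amplitude of each mode. This is a hedged sketch: it assumes the standard steady-state amplitude $A/\sqrt{(\omega_0^2-\omega^2)^2 + \Gamma^2\omega^2}$ of a driven, linearly-damped oscillator, and the parameter values and function names are illustrative rather than taken from the text.

```python
# Sketch: locate the amplitude resonance of each normal mode by scanning the
# driving frequency, and compare with sqrt(omega0^2 - 2*(Gamma/2)^2).
import math

def steady_state_amplitude(omega, omega0, gamma, a):
    """Amplitude of the steady-state solution of
    eta'' + gamma*eta' + omega0**2 * eta = a*cos(omega*t)."""
    return a / math.sqrt((omega0**2 - omega**2)**2 + (gamma * omega)**2)

def resonance_formula(omega0, gamma):
    """Resonance frequency quoted in the text."""
    return math.sqrt(omega0**2 - 2.0 * (gamma / 2.0)**2)

def resonance_by_scan(omega0, gamma, a, lo=0.5, hi=4.0, steps=35000):
    """Driving frequency of maximum amplitude on a uniform grid."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda w: steady_state_amplitude(w, omega0, gamma, a))

# Illustrative parameters (kappa_c stands for kappa').
m, kappa, kappa_c, beta, f0 = 1.0, 4.0, 2.5, 0.6, 1.0
gamma, a = beta / m, f0 / m
omega1 = math.sqrt((kappa + 2.0 * kappa_c) / m)
omega2 = math.sqrt(kappa / m)

for name, w0 in (("eta_1", omega1), ("eta_2", omega2)):
    print(name, resonance_by_scan(w0, gamma, a), resonance_formula(w0, gamma))
```

The scanned maximum and the closed-form value agree to within the grid spacing for both modes.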


8.11  Example:  Kirchhoff’s  rules  for  electrical  circuits 

The mathematical equations governing the behavior of mechanical systems and LRC electrical circuits have a close similarity. Thus variational methods can be used to derive the analogous behavior for electrical circuits. For example, for a system of $n$ separate circuits, the magnetic flux $\Phi_{ik}$ through circuit $i$, due to electrical current $I_k = \dot{q}_k$ flowing in circuit $k$, is given by

$$\Phi_{ik} = M_{ik}\dot{q}_k$$

where $M_{ik}$ is the mutual inductance. The diagonal term $M_{ii} \equiv L_i$ corresponds to the self inductance of circuit $i$. The net magnetic flux $\Phi_i$ through circuit $i$, due to all $n$ circuits, is the sum

$$\Phi_i = \sum_{k=1}^{n} M_{ik}\dot{q}_k$$

Thus the total magnetic energy $W_{mag}$, which is analogous to kinetic energy $T$, is given by summing over all $n$ circuits to be

$$W_{mag} = T = \frac{1}{2}\sum_{i=1}^{n}\sum_{k=1}^{n} M_{ik}\dot{q}_i\dot{q}_k$$

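Similarly the electrical energy $W_{elect}$ stored in the mutual capacitance $C_{ik}$ between the $n$ circuits, which is analogous to potential energy $U$, is given by

$$W_{elect} = U = \frac{1}{2}\sum_{i=1}^{n}\sum_{k=1}^{n} \frac{q_i q_k}{C_{ik}}$$

Thus the standard Lagrangian for this electric system is given by

$$L = T - U = \frac{1}{2}\sum_{i=1}^{n}\sum_{k=1}^{n}\left[ M_{ik}\dot{q}_i\dot{q}_k - \frac{q_i q_k}{C_{ik}} \right] \tag{$\alpha$}$$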


Assuming that Ohm’s Law is obeyed, that is, the dissipation force depends linearly on velocity, then the Rayleigh dissipation function can be written in the form

$$\mathcal{R} = \frac{1}{2}\sum_{i=1}^{n}\sum_{k=1}^{n} R_{ik}\dot{q}_i\dot{q}_k \tag{$\beta$}$$

where $R_{ik}$ is the resistance matrix. Thus the dissipation force, expressed in volts, is given by

$$F_i = -\frac{\partial \mathcal{R}}{\partial \dot{q}_i} = -\sum_{k=1}^{n} R_{ik}\dot{q}_k \tag{$\gamma$}$$


Inserting equations $\alpha$, $\beta$, and $\gamma$ into equation 8.81, plus making the assumption that an additional generalized electrical force $Q_i = \xi_i(t)$ volts is acting on circuit $i$, then the Euler-Lagrange equations give the following equations of motion.

$$\sum_{k=1}^{n}\left[ M_{ik}\ddot{q}_k + R_{ik}\dot{q}_k + \frac{q_k}{C_{ik}} \right] = \xi_i(t)$$

This is a generalized version of Kirchhoff’s loop rule which can be seen by considering the case where the diagonal term $i = k$ is the only non-zero term. Then

$$M_{ii}\ddot{q}_i + R_{ii}\dot{q}_i + \frac{q_i}{C_{ii}} = \xi_i(t)$$


This  sum  of  the  voltages  is  identical  to  the  usual  expression  for  Kirchhoff’s  loop  rule.  This  example 
illustrates  the  power  of  variational  methods  when  applied  to  fields  beyond  classical  mechanics. 
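
For the single-loop case above, the loop-rule equation can be integrated numerically and compared with the familiar underdamped discharge of a series LRC circuit. The following is a minimal sketch under stated assumptions: no applied voltage ($\xi = 0$), the capacitor initially charged with zero current, and illustrative component values; the names `ind`, `res`, and `cap` stand for $M_{ii}$, $R_{ii}$, and $C_{ii}$ and are not from the text.

```python
# Sketch: a single LRC loop, ind*q'' + res*q' + q/cap = emf, integrated with
# a classical fourth-order Runge-Kutta step and compared with the analytic
# underdamped discharge. Component values are illustrative.
import math

def rlc_rhs(q, i, ind, res, cap, emf=0.0):
    """Loop rule written as two first-order equations: (dq/dt, di/dt)."""
    return i, (emf - res * i - q / cap) / ind

def rk4_discharge(q0, ind, res, cap, t_end, steps):
    """Integrate from q(0) = q0, i(0) = 0 up to t_end; return (q, i)."""
    q, i = q0, 0.0
    h = t_end / steps
    for _ in range(steps):
        k1q, k1i = rlc_rhs(q, i, ind, res, cap)
        k2q, k2i = rlc_rhs(q + 0.5 * h * k1q, i + 0.5 * h * k1i, ind, res, cap)
        k3q, k3i = rlc_rhs(q + 0.5 * h * k2q, i + 0.5 * h * k2i, ind, res, cap)
        k4q, k4i = rlc_rhs(q + h * k3q, i + h * k3i, ind, res, cap)
        q += h * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        i += h * (k1i + 2.0 * k2i + 2.0 * k3i + k4i) / 6.0
    return q, i

def analytic_discharge(q0, ind, res, cap, t):
    """Underdamped solution with q(0) = q0 and zero initial current."""
    alpha = res / (2.0 * ind)
    wd = math.sqrt(1.0 / (ind * cap) - alpha**2)
    return q0 * math.exp(-alpha * t) * (math.cos(wd * t) + alpha / wd * math.sin(wd * t))

q_num, i_num = rk4_discharge(1.0, 1.0, 0.5, 0.25, 5.0, 5000)
print(q_num, analytic_discharge(1.0, 1.0, 0.5, 0.25, 5.0))
```

The inductance plays the role of mass, the resistance the role of the damping constant, and the inverse capacitance the role of the spring constant, so the same routine describes a linearly-damped mechanical oscillator.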
8.8 Summary


Hamilton’s  equations  of  motion 

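The Hamiltonian function was defined by inserting the generalized momentum into Jacobi’s generalized energy relation, giving

$$H(\mathbf{q}, \mathbf{p}, t) = \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.3}$$

The Legendre transform of the Lagrange-Euler equations led to Hamilton’s equations of motion.

$$\dot{q}_j = \frac{\partial H}{\partial p_j} \tag{8.25}$$

$$\dot{p}_j = -\frac{\partial H}{\partial q_j} + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \tag{8.26}$$

The generalized energy equation 7.38 gives the time dependence

$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \dot{q}_j \right) - \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial t} \tag{8.27}$$

where

$$\frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \tag{8.24}$$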

The $p_k, q_k$ are treated as independent canonical variables. Lagrange was the first to derive the canonical equations but he did not recognize them as a basic set of equations of motion. Hamilton derived the canonical equations of motion from his fundamental variational principle and made them the basis for a far-reaching theory of dynamics. Hamilton’s equations give $2s$ first-order differential equations for $p_k, q_k$ for each of the $s$ degrees of freedom. Lagrange’s equations give $s$ second-order differential equations for the variables $q_k, \dot{q}_k$.
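
As an illustration of how the first-order form is used in practice, the sketch below integrates Hamilton's equations for a one-dimensional harmonic oscillator with no constraint or external forces. It assumes a separable Hamiltonian $H = p^2/2m + \frac{1}{2}kq^2$, evaluates the partial derivatives by central differences, and uses a leapfrog step; the function names and parameter values are illustrative assumptions, not part of the text.

```python
# Sketch: integrate qdot = dH/dp, pdot = -dH/dq for a harmonic oscillator.
# The partial derivatives are taken numerically so that only H is supplied.
import math

MASS, SPRING = 1.0, 1.0  # illustrative values

def hamiltonian(q, p):
    """H = p^2/(2m) + k q^2/2 for a one-dimensional harmonic oscillator."""
    return p * p / (2.0 * MASS) + 0.5 * SPRING * q * q

def dH_dq(H, q, p, eps=1e-6):
    """Central-difference estimate of the partial derivative of H w.r.t. q."""
    return (H(q + eps, p) - H(q - eps, p)) / (2.0 * eps)

def dH_dp(H, q, p, eps=1e-6):
    """Central-difference estimate of the partial derivative of H w.r.t. p."""
    return (H(q, p + eps) - H(q, p - eps)) / (2.0 * eps)

def leapfrog(H, q, p, h, steps):
    """Half kick in p, full drift in q, half kick in p (separable H assumed)."""
    for _ in range(steps):
        p -= 0.5 * h * dH_dq(H, q, p)
        q += h * dH_dp(H, q, p)
        p -= 0.5 * h * dH_dq(H, q, p)
    return q, p

period = 2.0 * math.pi * math.sqrt(MASS / SPRING)
steps = 10000
q_end, p_end = leapfrog(hamiltonian, 1.0, 0.0, period / steps, steps)
print(q_end, p_end, hamiltonian(q_end, p_end) - hamiltonian(1.0, 0.0))
```

After one period the state returns close to its starting point and the value of $H$ is unchanged to within the integration error, as expected for a conservative system whose Hamiltonian has no explicit time dependence.
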

Routhian reduction technique

The Routhian reduction technique is a hybrid of Lagrangian and Hamiltonian mechanics that exploits the advantages of both approaches for solving problems involving cyclic variables. It is especially useful for solving motion in rotating systems in science and engineering. Two Routhians are used frequently for solving the equations of motion of rotating systems. Assuming that the variables between $1 \le i \le s$ are non-cyclic, while the $m$ variables between $s + 1 \le i \le n$ are ignorable cyclic coordinates, then the two Routhians are:


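$$R_{cyclic}(q_1, \dots, q_n; \dot{q}_1, \dots, \dot{q}_s; p_{s+1}, \dots, p_n; t) = \sum_{cyclic}^{m} p_i \dot{q}_i - L = H - \sum_{noncyclic}^{s} p_i \dot{q}_i \tag{8.65}$$

$$R_{noncyclic}(q_1, \dots, q_n; p_1, \dots, p_s; \dot{q}_{s+1}, \dots, \dot{q}_n; t) = \sum_{noncyclic}^{s} p_i \dot{q}_i - L = H - \sum_{cyclic}^{m} p_i \dot{q}_i \tag{8.68}$$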

The Routhian $R_{cyclic}$ is a negative Lagrangian for the non-cyclic variables between $1 \le i \le s$, where $s = n - m$, and is a Hamiltonian for the $m$ cyclic variables between $s + 1 \le i \le n$. Since the momenta conjugate to the cyclic variables are constants of motion, their solution is trivial, and the number of variables included in the Lagrangian is reduced from $n$ to $s = n - m$. The Routhian $R_{cyclic}$ is useful for solving some problems in classical mechanics. The Routhian $R_{noncyclic}$ is a Hamiltonian for the non-cyclic variables between $1 \le i \le s$, and is a negative Lagrangian for the $m$ cyclic variables between $s + 1 \le i \le n$. Since the momenta conjugate to the cyclic variables are constants of motion, the Routhian $R_{noncyclic}$ also is a constant of motion, but it does not equal the total energy since the coordinate transformation is time dependent. The Routhian $R_{noncyclic}$ is especially valuable for solving rotating many-body systems such as galaxies, molecules, or nuclei, since the Routhian $R_{noncyclic}$ is the Hamiltonian in the rotating body-fixed coordinate frame.
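
As a short worked illustration of the cyclic Routhian, consider planar motion of a mass $m$ in a central potential $U(r)$ described by polar coordinates $(r, \theta)$. This example and its symbols are chosen here for illustration and are not taken from the text; it applies equation 8.65 with $\theta$ as the single cyclic coordinate.

```latex
% Lagrangian and the conserved momentum conjugate to the cyclic angle
L = \tfrac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right) - U(r),
\qquad
p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m r^2 \dot{\theta}

% Cyclic Routhian (equation 8.65), with \dot{\theta} eliminated via p_\theta
R_{cyclic}(r, \dot{r}, p_\theta)
  = p_\theta \dot{\theta} - L
  = -\tfrac{1}{2} m \dot{r}^2 + \frac{p_\theta^2}{2 m r^2} + U(r)

% R_{cyclic} behaves as a negative Lagrangian for the non-cyclic variable r
\frac{d}{dt}\frac{\partial R_{cyclic}}{\partial \dot{r}}
  - \frac{\partial R_{cyclic}}{\partial r} = 0
\quad\Longrightarrow\quad
m \ddot{r} = \frac{p_\theta^2}{m r^3} - \frac{dU}{dr}

% R_{cyclic} behaves as a Hamiltonian for the cyclic variable \theta
\dot{\theta} = \frac{\partial R_{cyclic}}{\partial p_\theta} = \frac{p_\theta}{m r^2},
\qquad
\dot{p}_\theta = -\frac{\partial R_{cyclic}}{\partial \theta} = 0
```

Only the radial equation remains to be solved, with the constant $p_\theta$ entering through the centrifugal term.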

Dissipative  systems: 

There are three different approaches to Lagrangian or Hamiltonian mechanics that can be used to derive the equations of motion for dissipative systems. The first, and most straightforward approach, is to introduce the drag force as a generalized force in the Euler-Lagrange equations. The second approach uses Rayleigh’s dissipation scalar function $\mathcal{R}$, which applies when drag forces depend linearly on velocity. If the dissipative force can be expressed as

$$\mathbf{F}^f = -\nabla_{\mathbf{v}} \mathcal{R} \tag{8.75}$$
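then the Lagrange equations can be written in terms of the Rayleigh dissipation function as

$$\left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} + \frac{\partial \mathcal{R}}{\partial \dot{q}_j} = Q'_j + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) \tag{8.81}$$

The corresponding Hamiltonian relations are

$$\dot{q}_i = \frac{\partial H}{\partial p_i} \tag{8.84}$$

$$\dot{p}_i = -\frac{\partial H}{\partial q_i} + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_i} + Q'_i \right] - \frac{\partial \mathcal{R}}{\partial \dot{q}_i} \tag{8.85}$$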


The  third  approach,  discussed  in  chapter  13.7,  uses  non-standard  Lagrangians  or  Hamiltonians  that  are 
derived  from  the  required  equations  of  motion  using  the  inverse  variational  problem. 

Comparison  of  Lagrangian  and  Hamiltonian  mechanics 

Lagrangian and Hamiltonian dynamics are two powerful and related algebraic formulations of mechanics that are based on the same variational principle. They both concentrate solely on active forces and can ignore internal forces. They can handle many-body systems and allow convenient generalized coordinates of choice, which is impractical or impossible using Newtonian mechanics. Thus it is natural to compare the relative advantages of these two algebraic formalisms in order to decide which should be used for a specific problem.

For a system with $n$ generalized coordinates, plus $m$ constraint forces that are not required to be known, the Lagrangian approach, using a minimal set of generalized coordinates, reduces to only $s = n - m$ second-order differential equations and unknowns, compared to the Newtonian approach where there are $n + m$ unknowns. Alternatively, use of Lagrange multipliers allows determination of the constraint forces, resulting in $n + m$ second-order equations and unknowns. The Lagrangian potential function is limited to conservative forces; Lagrange multipliers can be used to handle holonomic forces of constraint, while generalized forces can be used to handle non-conservative and non-holonomic forces. The advantage of the Lagrange equations of motion is that they can deal with any type of force, conservative or non-conservative, and they directly determine $q, \dot{q}$ rather than $q, p$, which then requires relating $p$ to $\dot{q}$.

For a system with $n$ generalized coordinates, the Hamiltonian approach determines $2n$ first-order differential equations which are easier to solve than second-order equations. But the $2n$ solutions then must be combined to determine the equations of motion. The Hamiltonian approach is superior to the Lagrange approach in its ability to obtain an analytical solution of the integrals of the motion. Hamiltonian dynamics also has a means of determining the unknown variables for which the solution assumes a soluble form. Important applications of Hamiltonian mechanics are to quantum mechanics and statistical mechanics, where quantum analogs of $q_i$ and $p_i$ can be used to relate to the fundamental variables of Hamiltonian mechanics. This does not apply for the variables $q_i$ and $\dot{q}_i$ of Lagrangian mechanics. The Hamiltonian approach is especially powerful when the system has $m$ cyclic variables, since then the $m$ conjugate momenta $p_i$ are constants. Thus the $m$ conjugate variables $(q_i, p_i)$ can be factored out of the Hamiltonian, which reduces the number of conjugate variables required to $n - m$. This is not possible using the Lagrangian approach since, even though the $m$ coordinates $q_i$ can be factored out, the velocities $\dot{q}_i$ still must be included, thus the $n$ conjugate variables must be included. The Lagrange approach is advantageous for obtaining a numerical solution of systems in classical mechanics. However, Hamiltonian mechanics expresses the variables in terms of the fundamental canonical variables $(\mathbf{q}, \mathbf{p})$ which provides a more fundamental insight into the underlying physics.1


1 Recommended  reading:  "Classical  Mechanics"  H.  Goldstein,  Addison-Wesley,  Reading  (1950).  The  present  chapter 
closely  follows  the  notation  used  by  Goldstein  to  facilitate  cross-referencing  and  reading  the  many  other  textbooks  that  have 
adopted  this  notation. 
Workshop exercises

1. A block of mass $m$ rests on an inclined plane making an angle $\theta$ with the horizontal. The inclined plane (a triangular block of mass $M$) is free to slide horizontally without friction. The block of mass $m$ is also free to slide on the larger block of mass $M$ without friction.

(a)  Construct  the  Lagrangian  function. 

(b)  Derive  the  equations  of  motion  for  this  system. 

(c)  Calculate  the  canonical  momenta. 

(d)  Construct  the  Hamiltonian  function. 

(e)  Find  which  of  the  two  momenta  found  in  part  (c)  is  a constant  of  motion  and  discuss  why  it  is  so.  If  the 
two  blocks  start  from  rest,  what  is  the  value  of  this  constant  of  motion? 

2.  Discuss  among  yourselves  the  following  four  conditions  that  can  exist  for  the  Hamiltonian  and  give  several 
examples  of  systems  exhibiting  each  of  the  four  conditions. 

(a)  The  Hamiltonian  is  conserved  and  equals  the  total  mechanical  energy 

(b)  The  Hamiltonian  is  conserved  but  does  not  equal  the  total  mechanical  energy 

(c)  The  Hamiltonian  is  not  conserved  but  does  equal  the  total  mechanical  energy 

(d) The Hamiltonian is not conserved and does not equal the total mechanical energy.

3. A block of mass $m$ rests on an inclined plane making an angle $\theta$ with the horizontal. The inclined plane (a triangular block of mass $M$) is free to slide horizontally without friction. The block of mass $m$ is also free to slide on the larger block of mass $M$ without friction.

(a)  Construct  the  Lagrangian  function. 

(b)  Derive  the  equations  of  motion  for  this  system. 

(c)  Calculate  the  canonical  momenta. 

(d)  Construct  the  Hamiltonian  function. 

(e)  Find  which  of  the  two  momenta  found  in  part  (c)  is  a constant  of  motion  and  discuss  why  it  is  so.  If  the 
two  blocks  start  from  rest,  what  is  the  value  of  this  constant  of  motion? 

4.  Discuss  among  yourselves  the  following  four  conditions  that  can  exist  for  the  Hamiltonian  and  give  several 
examples  of  systems  exhibiting  each  of  the  four  conditions. 

a)  The  Hamiltonian  is  conserved  and  equals  the  total  mechanical  energy 

b)  The  Hamiltonian  is  conserved  but  does  not  equal  the  total  mechanical  energy 

c)  The  Hamiltonian  is  not  conserved  but  does  equal  the  total  mechanical  energy 

d) The Hamiltonian is not conserved and does not equal the total mechanical energy

5.  Compare  the  Lagrangian  formalism  and  the  Hamiltonian  formalism  by  creating  a two-column  chart.  Label  one 
side  “Lagrangian”  and  the  other  side  “Hamiltonian”  and  discuss  the  similarities  and  differences.  Here  are  some 
ideas  to  get  you  started: 

• What  are  the  basic  variables  in  each  formalism? 

• What  are  the  form  and  number  of  the  equations  of  motion  derived  in  each  case? 

• How  does  the  Lagrangian  “state  space”  compare  to  the  Hamiltonian  “phase  space”? 

6. It can be shown that if $L(q, \dot{q}, t)$ is the Lagrangian of a particle moving in one dimension, then $L$ and $L'$ yield the same equations of motion, where $L'(q, \dot{q}, t) = L(q, \dot{q}, t) + \frac{df}{dt}$ and $f(q, t)$ is an arbitrary function. This problem explores the consequences of this on the Hamiltonian formalism.

(a) Relate the new canonical momentum $p'$, for $L'$, to the old canonical momentum $p$, for $L$.

(b) Express the new Hamiltonian $H'(q', p', t)$ for $L'$ in terms of the old Hamiltonian $H(q, p, t)$ and $f$.

(c) Explicitly show that the new Hamilton’s equations for $H'$ are equivalent to the old Hamilton’s equations for $H$.

7. A massless hoop of radius $R$ is rotating about an axis perpendicular to its central axis at constant angular velocity $\omega$. A mass $m$ can freely slide around the hoop.

(a) Determine the Lagrangian of the system.

(b) Determine the Hamiltonian of the system. Does it equal the total mechanical energy?

(c) Determine the Lagrangian of the system with respect to a coordinate frame in which $H = T + U_{eff}$. What is $U_{eff}$? What force generates the additional term in $U_{eff}$?

8. Consider a pendulum of length $L$ attached to the end of a rod of length $R$. The rod is rotating at constant angular velocity $\omega$ in the plane. Assume the pendulum is always taut.

(a) Determine equations of motion.

(b) For what value of $\omega^2 R$ is this system the same as a plane pendulum in a constant gravitational field?

(c) Show $H \neq E$. What is the reason?

Problems 

1) A particle of mass $m$ in a gravitational field slides on the inside of a smooth parabola of revolution whose axis is vertical. Using the distance from the axis $r$, and the azimuthal angle $\varphi$ as generalized coordinates, find the following.

a) The Lagrangian of the system.

b) The generalized momenta and the corresponding Hamiltonian.

c) The equation of motion for the coordinate $r$ as a function of time.

d) If $\frac{d\varphi}{dt} = 0$, show that the particle can execute small oscillations about the lowest point of the paraboloid and find the frequency of these oscillations.

2) Consider a particle of mass $m$ which is constrained to move on the surface of a sphere of radius $R$. There are no external forces of any kind acting on the particle.

a)  What  is  the  number  of  generalized  coordinates  necessary  to  describe  the  problem? 

b)  Choose  a set  of  generalized  coordinates  and  write  the  Lagrangian  of  the  system. 

c)  What  is  the  Hamiltonian  of  the  system?  Is  it  conserved? 

d)  Prove  that  the  motion  of  the  particle  is  along  a great  circle  of  the  sphere. 

3. A block of mass $m$ is attached to a wedge of mass $M$ by a spring with spring constant $k$. The inclined frictionless surface of the wedge makes an angle $\alpha$ to the horizontal. The wedge is free to slide on a horizontal frictionless surface as shown in the figure.

a) Given that the relaxed length of the spring is $d$, find the value $s_0$ when both block and wedge are stationary.

b)  Find  the  Lagrangian  for  the  system  as  a function  of  the  x coordinate  of  the  wedge  and  the  length  of  spring  s. 
Write  down  the  equations  of  motion. 

c)  What  is  the  natural  frequency  of  vibration? 
4. A fly-ball governor comprises two masses $m$ connected by 4 hinged arms of length $l$ to a vertical shaft and to a mass $M$ which can slide up or down the shaft without friction in a uniform vertical gravitational field as shown in the figure. The assembly is constrained to rotate around the axis of the vertical shaft with the same angular velocity as that of the vertical shaft. Neglect the mass of the arms, air friction, and assume that the mass $M$ has a negligible moment of inertia. Assume that the whole system is constrained to rotate with a constant angular velocity $\omega_0$.

a) Choose suitable coordinates and use the Lagrangian to derive equations of motion of the system around the equilibrium position.

b) Determine the height $z$ of the mass $M$ above its lowest position as a function of $\omega_0$.

c) Find the frequency of small oscillations about this steady motion.

d) Derive a Routhian that provides the Hamiltonian in the rotating system.

e) Is the total energy of the fly-ball governor in the rotating frame of reference constant in time?

f) Suppose that the shaft and assembly are not constrained to rotate at a constant angular velocity $\omega_0$, that is, it is allowed to rotate freely at angular velocity $\dot{\varphi}$. What is the difference in the overall motion?




5. A rigid straight, frictionless, massless, rod rotates about the $z$ axis at an angular velocity $\dot{\theta}$. A mass $m$ slides along the frictionless rod and is attached to the rod by a massless spring of spring constant $\kappa$.

a) Derive the Lagrangian and the Hamiltonian.

b) Derive the equations of motion in the stationary frame using Hamiltonian mechanics.

c) What are the constants of motion?

d) If the rotation is constrained to have a constant angular velocity $\dot{\theta} = \omega$ then is the non-cyclic Routhian $R_{noncyclic} = H - p_\theta \dot{\theta}$ a constant of motion, and does it equal the total energy?

e) Use the non-cyclic Routhian $R_{noncyclic}$ to derive the radial equation of motion in the rotating frame of reference for the cranked system with $\dot{\theta} = \omega$.
6. A thin uniform rod of length $2L$ and mass $M$ is suspended from a massless string of length $l$ tied to a nail. Initially the rod hangs vertically. A weak horizontal force $F$ is applied to the rod’s free end.

a)  Write  the  Lagrangian  for  this  system. 

b)  For  very  short  times  such  that  all  angles  are  small,  determine  the  angles  that  string  and  the  rod  make  with 
the  vertical.  Start  from  rest  at  t = 0. 

c)  Draw  a diagram  to  illustrate  the  initial  motion  of  the  rod. 


7. A uniform ladder of mass $M$ and length $2L$ is leaning against a frictionless vertical wall with its feet on a frictionless horizontal floor. Initially the stationary ladder is released at an angle $\theta_0 = 60°$ to the floor. Assume that the gravitational field $g = 9.81$ m/s$^2$ acts vertically downward and that the moment of inertia of the ladder about its midpoint is $I = \frac{1}{3}ML^2$.

a) Derive the Lagrangian

b) Derive the Hamiltonian

c) Explain if the Hamiltonian is conserved and/or if it equals the total energy

d) Use the Lagrangian to derive the equations of motion

e) Derive the angle $\theta$ at which the ladder loses contact with the vertical wall.


8. The classical mechanics exam induces Jacob to try his hand at bungee jumping. Assume Jacob’s mass $m$ is suspended in a gravitational field by the bungee of unstretched length $b$ and spring constant $k$. Besides the longitudinal oscillations due to the bungee jump, Jacob also swings with plane pendulum motion in a vertical plane. Use polar coordinates $r, \phi$, neglect air drag, and assume that the bungee always is under tension.

a) Derive the Lagrangian

b) Determine Lagrange’s equation of motion for angular motion and identify by name the forces contributing to the angular motion.

c) Determine Lagrange’s equation of motion for radial oscillation and identify by name the forces contributing to the tension in the spring.

d) Derive the generalized momenta

e) Determine the Hamiltonian and give all of Hamilton’s equations of motion.


Chapter  9 


Conservative  two-body  central  forces 


9.1  Introduction 

Conservative two-body central forces are of tremendous importance in physics because of the pivotal role that the Coulomb and the gravitational forces play in nature. The Coulomb force plays a role in electrodynamics, molecular, atomic, and nuclear physics, while the gravitational force plays an analogous role in celestial mechanics. Therefore this chapter focusses on the physics of systems involving conservative two-body central forces.

A conservative  two-body  central  force  has  the  following  three  important  attributes. 

1.  Conservative:  A conservative  force  depends  only  on  the  particle  position,  that  is,  the  force  is  not 
time  dependent.  Moreover  the  work  done  by  the  force  moving  a body  between  any  two  points  1 and  2 
is  path  independent.  Conservative  fields  are  discussed  in  chapter  2.8. 

2. Two-body: A two-body force between two bodies depends only on the relative locations of the two interacting bodies and is not influenced by the proximity of additional bodies. For two-body forces acting between $n$ bodies, the force on body $i$ is the vector superposition of the two-body forces due to the interactions with each of the other $n - 1$ bodies. This differs from three-body forces where the force between any two bodies is influenced by the proximity of a third body.

3. Central: A central force field depends on the distance $r_{12}$ from the origin of the force at point 1, to the body location at point 2, and the force is directed along the line joining them, that is, $\hat{\mathbf{r}}_{12}$.

A conservative, two-body, central force combines the above three attributes and can be expressed as

$$\mathbf{F}_{21} = f(r_{12})\hat{\mathbf{r}}_{12} \tag{9.1}$$

The force field $\mathbf{F}_{21}$ has a magnitude $f(r_{12})$ that depends only on the magnitude of the relative separation vector $\mathbf{r}_{12} = \mathbf{r}_2 - \mathbf{r}_1$ between the origin of the force at point 1 and point 2 where the force acts, and the force is directed along the line joining them, that is, $\hat{\mathbf{r}}_{12}$.

Chapter  2.8  showed  that  if  a two-body  central  force  is  conservative,  then  it  can  be  written  as  the  gradient 
of  a scalar  potential  energy  U (r)  which  is  a function  of  the  distance  from  the  center  of  the  force  field. 


$$\mathbf{F}_{21} = -\nabla U(r_{12}) \tag{9.2}$$

As  discussed  in  chapter  2,  the  ability  to  represent  the  conservative  central  force  by  a scalar  function  U(r) 
greatly  simplifies  the  treatment  of  central  forces. 
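
Equations 9.1 and 9.2 can be compared numerically for a specific force law. The sketch below assumes an attractive inverse-square force, $f(r) = -k/r^2$ with $U(r) = -k/r$, and checks that the force built from equation 9.1 matches a finite-difference evaluation of $-\nabla U$ taken with respect to the position of body 2. The constant `K_STRENGTH`, the sample positions, and the function names are illustrative assumptions.

```python
# Sketch: compare F_21 = f(r12) * r12_hat with -grad U(r12) for an
# attractive inverse-square force. Values and names are illustrative.
import math

K_STRENGTH = 2.0  # plays the role of G*m1*m2 for gravity

def f_gravity(r):
    """Radial force component; negative means attraction toward point 1."""
    return -K_STRENGTH / r**2

def u_gravity(r):
    """Potential energy whose negative radial derivative is f_gravity."""
    return -K_STRENGTH / r

def central_force(r1, r2, f):
    """Force on the body at r2 due to the body at r1 (equation 9.1)."""
    sep = [b - a for a, b in zip(r1, r2)]
    dist = math.sqrt(sum(c * c for c in sep))
    return [f(dist) * c / dist for c in sep]

def minus_gradient(U, r1, r2, eps=1e-6):
    """-grad of U(|r2 - r1|) with respect to r2, by central differences."""
    grad = []
    for axis in range(3):
        plus, minus = list(r2), list(r2)
        plus[axis] += eps
        minus[axis] -= eps
        grad.append(-(U(math.dist(r1, plus)) - U(math.dist(r1, minus))) / (2.0 * eps))
    return grad

r1, r2 = [0.0, 0.0, 0.0], [1.0, 2.0, 2.0]
force_on_2 = central_force(r1, r2, f_gravity)
force_on_1 = central_force(r2, r1, f_gravity)
print(force_on_2, minus_gradient(u_gravity, r1, r2))
```

The force on body 1 is the negative of the force on body 2, as required for a two-body central force.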

The  Coulomb  and  gravitational  forces  both  are  true  conservative,  two-body,  central  forces  whereas  the 
nuclear  force  between  nucleons  in  the  nucleus  has  three-body  components.  Two  bodies  interacting  via  a 
two-body  central  force  is  the  simplest  possible  system  to  consider,  but  equation  9.1  is  applicable  equally  for 
n bodies  interacting  via  two-body  central  forces  because  the  superposition  principle  applies  for  two-body 
central  forces.  This  chapter  will  focus  first  on  the  motion  of  two  bodies  interacting  via  conservative  two-body 
central  forces  followed  by  a brief  discussion  of  the  motion  for  n > 2 interacting  bodies. 
9.2 Equivalent one-body representation for two-body motion


The motion of two bodies, 1 and 2, interacting via two-body central forces, requires 6 spatial coordinates, that is, three each for $\mathbf{r}_1$ and $\mathbf{r}_2$. Since the two-body central force only depends on the relative separation $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ of the two bodies, it is more convenient to separate the 6 degrees of freedom into 3 spatial coordinates of relative motion $\mathbf{r}$, plus 3 spatial coordinates for the center-of-mass location $\mathbf{R}$ as described in chapter 2.7. It will be shown here that the equation of motion for relative motion of the two bodies in the center of mass can be represented by an equivalent one-body problem which simplifies the mathematics.

Consider two bodies acted upon by a conservative two-body central force where the position vectors $\mathbf{r}_1$ and $\mathbf{r}_2$ specify the location of each particle as illustrated in figure 9.1. An alternate set of six variables would be the three components of the center of mass position vector $\mathbf{R}$ and the three components specifying the difference vector $\mathbf{r}$ defined by figure 9.1. Define the vectors $\mathbf{r}'_1$ and $\mathbf{r}'_2$ as the position vectors of the masses $m_1$ and $m_2$ with respect to the center of mass. Then

$$\mathbf{r}_1 = \mathbf{R} + \mathbf{r}'_1 \qquad \mathbf{r}_2 = \mathbf{R} + \mathbf{r}'_2 \tag{9.3}$$

Figure 9.1: Center of mass coordinates for the two-body system.


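By the definition of the center of mass

$$\mathbf{R} = \frac{m_1 \mathbf{r}_1 + m_2 \mathbf{r}_2}{m_1 + m_2} \tag{9.4}$$

and

$$m_1 \mathbf{r}'_1 + m_2 \mathbf{r}'_2 = 0 \tag{9.5}$$

so that

$$\mathbf{r}'_2 = -\frac{m_1}{m_2}\mathbf{r}'_1 \tag{9.6}$$

Therefore

$$\mathbf{r} = \mathbf{r}'_1 - \mathbf{r}'_2 = \frac{m_1 + m_2}{m_2}\mathbf{r}'_1 \tag{9.7}$$

that is,

$$\mathbf{r}'_1 = \frac{m_2}{m_1 + m_2}\mathbf{r} \tag{9.8}$$

Similarly,

$$\mathbf{r}'_2 = -\frac{m_1}{m_1 + m_2}\mathbf{r} \tag{9.9}$$

Substituting these into equation 9.3 gives

$$\mathbf{r}_1 = \mathbf{R} + \mathbf{r}'_1 = \mathbf{R} + \frac{m_2}{m_1 + m_2}\mathbf{r}$$

$$\mathbf{r}_2 = \mathbf{R} + \mathbf{r}'_2 = \mathbf{R} - \frac{m_1}{m_1 + m_2}\mathbf{r} \tag{9.10}$$

That is, the two vectors $\mathbf{r}_1, \mathbf{r}_2$ are written in terms of the position vector for the center of mass $\mathbf{R}$ and the position vector $\mathbf{r}$ for relative motion in the center of mass.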
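
The change of variables can be written as a pair of small functions, which makes it easy to confirm that equation 9.10 inverts the definitions of $\mathbf{R}$ and $\mathbf{r}$. This is a minimal sketch; the function names, masses, and positions are illustrative assumptions.

```python
# Sketch: convert between laboratory positions (r1, r2) and the
# center-of-mass / relative coordinates (R, r), then convert back.

def to_cm_and_relative(m1, m2, r1, r2):
    """Return (R, r): center-of-mass position and r = r1 - r2."""
    total = m1 + m2
    cm = [(m1 * a + m2 * b) / total for a, b in zip(r1, r2)]
    rel = [a - b for a, b in zip(r1, r2)]
    return cm, rel

def to_positions(m1, m2, cm, rel):
    """Equation 9.10: recover r1 and r2 from R and r."""
    total = m1 + m2
    r1 = [c + m2 / total * s for c, s in zip(cm, rel)]
    r2 = [c - m1 / total * s for c, s in zip(cm, rel)]
    return r1, r2

m1, m2 = 2.0, 1.0
r1, r2 = [3.0, 0.0, 0.0], [0.0, 3.0, 0.0]
cm, rel = to_cm_and_relative(m1, m2, r1, r2)
r1_back, r2_back = to_positions(m1, m2, cm, rel)
print(cm, rel)
print(r1_back, r2_back)
```

The heavier body stays closer to the center of mass, consistent with equations 9.8 and 9.9.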

Assuming that the two-body central force is conservative and represented by $U(r)$, then the Lagrangian of the two-body system can be written as

$$L = \frac{1}{2}m_1 |\dot{\mathbf{r}}_1|^2 + \frac{1}{2}m_2 |\dot{\mathbf{r}}_2|^2 - U(r) \tag{9.11}$$
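Differentiating equations 9.10 with respect to time, and inserting them into the Lagrangian, gives

$$L = \frac{1}{2}M |\dot{\mathbf{R}}|^2 + \frac{1}{2}\mu |\dot{\mathbf{r}}|^2 - U(r) \tag{9.12}$$

where the total mass $M$ is defined as

$$M = m_1 + m_2 \tag{9.13}$$

and the reduced mass $\mu$ is defined by

$$\mu = \frac{m_1 m_2}{m_1 + m_2} \tag{9.14}$$

or equivalently

$$\frac{1}{\mu} = \frac{1}{m_1} + \frac{1}{m_2} \tag{9.15}$$

The total Lagrangian can be separated into two independent parts

$$L = \frac{1}{2}M |\dot{\mathbf{R}}|^2 + L_{cm} \tag{9.16}$$

where

$$L_{cm} = \frac{1}{2}\mu |\dot{\mathbf{r}}|^2 - U(r) \tag{9.17}$$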
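
The separation of the kinetic energy into a center-of-mass part and a relative part can be verified for arbitrary velocities. The following sketch uses illustrative masses and velocities; the function names are assumptions introduced here.

```python
# Sketch: check that (1/2) m1 v1^2 + (1/2) m2 v2^2 equals
# (1/2) M V^2 + (1/2) mu v^2, with V the center-of-mass velocity
# and v = v1 - v2 the relative velocity.

def kinetic_energy_lab(m1, m2, v1, v2):
    """Sum of the kinetic energies of the two bodies."""
    return 0.5 * m1 * sum(c * c for c in v1) + 0.5 * m2 * sum(c * c for c in v2)

def kinetic_energy_split(m1, m2, v1, v2):
    """Return (center-of-mass part, relative part) of the kinetic energy."""
    total = m1 + m2
    mu = m1 * m2 / total                       # reduced mass
    v_cm = [(m1 * a + m2 * b) / total for a, b in zip(v1, v2)]
    v_rel = [a - b for a, b in zip(v1, v2)]
    t_cm = 0.5 * total * sum(c * c for c in v_cm)
    t_rel = 0.5 * mu * sum(c * c for c in v_rel)
    return t_cm, t_rel

m1, m2 = 2.0, 1.0
v1, v2 = [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]
t_lab = kinetic_energy_lab(m1, m2, v1, v2)
t_cm, t_rel = kinetic_energy_split(m1, m2, v1, v2)
print(t_lab, t_cm, t_rel)
```

No cross term between the center-of-mass velocity and the relative velocity survives, which is why the Lagrangian separates into two independent parts.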


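Assuming that no external forces are acting, then $\frac{\partial L}{\partial \mathbf{R}} = 0$ and the three Lagrange equations for each of the three coordinates of the $\mathbf{R}$ coordinate can be written as

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{R}}} = \frac{d\mathbf{P}_{cm}}{dt} = 0 \tag{9.18}$$

That is, for a pure central force, the center-of-mass momentum $\mathbf{P}_{cm}$ is a constant of motion where

$$\mathbf{P}_{cm} = \frac{\partial L}{\partial \dot{\mathbf{R}}} = M\dot{\mathbf{R}} \tag{9.19}$$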

It is convenient to work in the center-of-mass frame using the effective Lagrangian $L_{cm}$. In the center-of-mass frame of reference, the translational kinetic energy $\frac{1}{2}M|\dot{\mathbf{R}}|^2$ associated with center-of-mass motion is ignored, and only the energy in the center-of-mass is considered. This center-of-mass energy is the energy involved in the interaction between the colliding bodies. Thus, in the center-of-mass, the problem has been reduced to an equivalent one-body problem of a mass $\mu$ moving about a fixed force center with a path given by $\mathbf{r}$ which is the separation vector between the two bodies, as shown in figure 9.2. In reality, both masses revolve around their center of mass, also called the barycenter, in the center-of-mass frame as shown in figure 9.2. Knowing $\mathbf{r}$ allows the trajectory of each mass about the center of mass, $\mathbf{r}'_1$ and $\mathbf{r}'_2$, to be calculated. Of course the true path in the laboratory frame of reference must take into account the translational motion of the center of mass, in addition to the motion of the equivalent one-body representation relative to the barycenter. Be careful to remember the difference between the actual trajectories of each body, and the effective trajectory assumed when using the reduced mass which only determines the relative separation $\mathbf{r}$ of the two bodies. This reduction to an equivalent one-body problem greatly simplifies the solution of the motion, but it misrepresents the actual trajectories and the spatial locations of each mass in space. The equivalent one-body representation will be used extensively throughout this chapter.

Figure 9.2: Orbits of a two-body system with mass ratio of 2 rotating about the center-of-mass, O. The dashed ellipse is the equivalent one-body orbit with the center of force at the focus O.
9.3  Angular  momentum  L


The notation used for the angular momentum vector is $\mathbf{L}$, where the magnitude is designated by $|\mathbf{L}| = l$. Be careful not to confuse the angular momentum vector $\mathbf{L}$ with the Lagrangian $L_{cm}$. Note that the angular momentum for two-body rotation about the center of mass with angular velocity $\boldsymbol{\omega}$ is identical when evaluated in either the laboratory or equivalent two-body representation. That is, using equations 9.8 and 9.9

$$\mathbf{L} = m_1 r_1^{\prime 2}\boldsymbol{\omega} + m_2 r_2^{\prime 2}\boldsymbol{\omega} = \mu r^2\boldsymbol{\omega} \tag{9.20}$$
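
As a quick numerical check of equation 9.20, the sketch below compares the two-body and one-body expressions for the magnitude of the angular momentum. It assumes the usual barycentric relations $r'_1 = m_2 r/(m_1+m_2)$ and $r'_2 = m_1 r/(m_1+m_2)$, which is what equations 9.8 and 9.9 are taken to provide here; the function names are illustrative only.

```python
def reduced_mass(m1, m2):
    """Reduced mass mu = m1*m2/(m1+m2)."""
    return m1 * m2 / (m1 + m2)

def barycentric_radii(m1, m2, r):
    """Distances of each mass from the barycenter for separation r (assumed relations)."""
    total = m1 + m2
    return m2 * r / total, m1 * r / total

def angular_momentum_two_body(m1, m2, r, omega):
    """Left-hand side of equation 9.20: both masses circling the barycenter."""
    r1, r2 = barycentric_radii(m1, m2, r)
    return m1 * r1**2 * omega + m2 * r2**2 * omega

def angular_momentum_one_body(m1, m2, r, omega):
    """Right-hand side of equation 9.20: reduced mass at the full separation r."""
    return reduced_mass(m1, m2) * r**2 * omega
```

For any choice of masses, separation, and angular velocity the two functions should agree to rounding error, which is the content of equation 9.20.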


The center-of-mass Lagrangian leads to the following two general properties regarding the angular momentum vector $\mathbf{L}$.

1) The motion lies entirely in a plane perpendicular to the fixed direction of the total angular momentum vector. This is because

$$\mathbf{L}\cdot\mathbf{r} = \mathbf{r}\times\mathbf{p}\cdot\mathbf{r} = 0 \tag{9.21}$$

that is, the radius vector is in the plane perpendicular to the total angular momentum vector. Thus, it is possible to express the Lagrangian in polar coordinates $(r, \psi)$ rather than spherical coordinates. In polar coordinates the center-of-mass Lagrangian becomes

$$L_{cm} = \frac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\psi}^2\right) - U(r) \tag{9.22}$$

2) If the potential is spherically symmetric, then the polar angle $\psi$ is cyclic and therefore Noether's theorem gives that the angular momentum $\mathbf{p}_\psi \equiv \mathbf{L} = \mathbf{r}\times\mathbf{p}$ is a constant of motion. That is, since $\partial L_{cm}/\partial\psi = 0$, then the Lagrange equations imply that

$$\dot{\mathbf{p}}_\psi = \frac{d}{dt}\frac{\partial L_{cm}}{\partial\dot{\boldsymbol{\psi}}} = 0 \tag{9.23}$$

where the vectors $\dot{\mathbf{p}}_\psi$ and $\dot{\boldsymbol{\psi}}$ imply that equation 9.23 refers to three independent equations corresponding to the three components of these vectors. Thus the angular momentum $\mathbf{p}_\psi$, conjugate to $\boldsymbol{\psi}$, is a constant of motion. The generalized momentum $\mathbf{p}_\psi$ is a first integral of the motion which equals

$$\mathbf{p}_\psi = \frac{\partial L_{cm}}{\partial\dot{\boldsymbol{\psi}}} = \mu r^2\dot{\boldsymbol{\psi}} = \hat{\mathbf{p}}_\psi\, l \tag{9.24}$$

where the magnitude of the angular momentum $l$, and the direction $\hat{\mathbf{p}}_\psi$, both are constants of motion.

A simple geometric interpretation of equation 9.24 is illustrated in figure 9.3. The radius vector sweeps out an area $d\mathbf{A}$ in time $dt$ where

$$d\mathbf{A} = \frac{1}{2}\mathbf{r}\times\mathbf{v}\,dt \tag{9.25}$$

and the vector $\mathbf{A}$ is perpendicular to the $x$-$y$ plane. The rate of change of area is

$$\frac{d\mathbf{A}}{dt} = \frac{1}{2}\mathbf{r}\times\mathbf{v} \tag{9.26}$$

But the angular momentum is

$$\mathbf{L} = \mathbf{r}\times\mathbf{p} = \mu\,\mathbf{r}\times\mathbf{v} = 2\mu\frac{d\mathbf{A}}{dt} \tag{9.27}$$

Thus the conservation of angular momentum implies that the areal velocity $\frac{d\mathbf{A}}{dt}$ also is a constant of motion. This fact is called Kepler's second law of planetary motion which he deduced in 1609 based on Tycho Brahe's 55 years of observational records of the motion of Mars. Kepler's second law implies that a planet moves fastest when closest to the sun and slowest when farthest from the sun. Note that Kepler's second law is a statement of the conservation of angular momentum which is independent of the radial form of the central potential.
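
The independence of the areal velocity from the radial form of the force can be illustrated numerically. The sketch below is only an illustration under stated assumptions: unit reduced mass, arbitrary units, a velocity-Verlet integrator chosen for convenience, and two hypothetical force laws (inverse-square and linear restoring). It records $\frac{1}{2}(\mathbf{r}\times\mathbf{v})_z$ from equation 9.26 along each trajectory.

```python
import math

def central_acceleration(x, y, accel_of_r):
    """Cartesian acceleration for a central force; accel_of_r(r) is F(r)/mu (negative = attractive)."""
    r = math.hypot(x, y)
    a = accel_of_r(r)
    return a * x / r, a * y / r

def areal_velocity_history(accel_of_r, x, y, vx, vy, dt=1e-3, steps=5000):
    """Integrate with velocity Verlet and record dA/dt = (r x v)_z / 2 at every step."""
    history = []
    ax, ay = central_acceleration(x, y, accel_of_r)
    for _ in range(steps):
        history.append(0.5 * (x * vy - y * vx))
        vx += 0.5 * dt * ax          # half kick
        vy += 0.5 * dt * ay
        x += dt * vx                 # drift
        y += dt * vy
        ax, ay = central_acceleration(x, y, accel_of_r)
        vx += 0.5 * dt * ax          # half kick
        vy += 0.5 * dt * ay
    return history

# Same initial conditions, two different radial force laws.
inverse_square = areal_velocity_history(lambda r: -1.0 / r**2, 1.0, 0.0, 0.0, 0.8)
linear_restoring = areal_velocity_history(lambda r: -r, 1.0, 0.0, 0.0, 0.8)
```

Both histories should stay at the initial value to rounding error, even though the two orbits differ. The kick and drift steps of this integrator each leave $\mathbf{r}\times\mathbf{v}$ unchanged for a central force, so the check is not limited by the step size.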


Figure  9.3:  Area  swept  out  by  the  radius 
vector  in  the  time  dt. 
9.4  Equations  of  motion


The equations of motion for two bodies interacting via a conservative two-body central force can be determined using the center-of-mass Lagrangian, $L_{cm}$, given by equation 9.22. For the radial coordinate, the operator equation $\Lambda_r L_{cm} = 0$ leads to

$$\frac{d}{dt}\left(\mu\dot{r}\right) - \mu r\dot{\psi}^2 + \frac{\partial U}{\partial r} = 0 \tag{9.28}$$

But

$$\dot{\psi} = \frac{l}{\mu r^2} \tag{9.29}$$

therefore the radial equation of motion is

$$\mu\ddot{r} = -\frac{\partial U}{\partial r} + \frac{l^2}{\mu r^3} \tag{9.30}$$

Similarly, for the angular coordinate, the operator equation $\Lambda_\psi L_{cm} = 0$ leads to equation 9.24. That is, the angular equation of motion for the magnitude of $p_\psi$ is

$$p_\psi = \frac{\partial L_{cm}}{\partial\dot{\psi}} = \mu r^2\dot{\psi} = l \tag{9.31}$$

Lagrange's equations have given two equations of motion, one dependent on radius $r$ and the other on the polar angle $\psi$. Note that the radial acceleration is just a statement of Newton's Laws of motion for the radial force $F_r$ in the center-of-mass system of

$$F_r = -\frac{\partial U}{\partial r} + \frac{l^2}{\mu r^3} \tag{9.32}$$

This can be written in terms of an effective potential

$$U_{eff}(r) \equiv U(r) + \frac{l^2}{2\mu r^2} \tag{9.33}$$

which leads to an equation of motion

$$F_r = \mu\ddot{r} = -\frac{\partial U_{eff}(r)}{\partial r} \tag{9.34}$$

Since $\frac{l^2}{\mu r^3} = \mu r\dot{\psi}^2$, the second term in equation (9.32) is the usual centrifugal force that originates because the variable $r$ is in a non-inertial, rotating frame of reference. Note that the angular equation of motion is independent of the radial dependence of the conservative two-body central force.

Figure 9.4 shows, by dashed lines, the radial dependence of the potential corresponding to the attractive inverse-square law force, that is $U = \frac{k}{r}$ with $k < 0$, and the potential $\frac{l^2}{2\mu r^2}$ corresponding to the centrifugal term, that is, to a repulsive centrifugal force. The sum of these two potentials, $U_{eff}(r)$, shown by the solid line, has a minimum value $U_{min}$ at a certain radius, similar to that manifest by the diatomic molecule discussed in example 2.7.

It is remarkable that the six-dimensional equations of motion, for two bodies interacting via a two-body central force, have been reduced to trivial center-of-mass translational motion, plus a one-dimensional one-body problem given by (9.34) in terms of the relative separation $r$ and an effective potential $U_{eff}(r)$.
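
The minimum of the effective potential can be located numerically as a check on the picture above. The sketch below assumes the inverse-square potential $U = k/r$ with $k < 0$, arbitrary units, and a simple golden-section search (a convenience choice, not a method used in the text). Setting $dU_{eff}/dr = 0$ in equation 9.33 gives $r_0 = -l^2/(\mu k)$ and $U_{min} = -\mu k^2/(2l^2)$, which the search should reproduce.

```python
import math

def effective_potential(r, k, l, mu):
    """U_eff(r) = k/r + l^2/(2 mu r^2), equation 9.33 with U = k/r."""
    return k / r + l**2 / (2.0 * mu * r**2)

def golden_section_minimum(f, lo, hi, tol=1e-10):
    """Locate the minimum of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

# Illustrative values: attractive force constant, unit angular momentum and reduced mass.
k, l, mu = -1.0, 1.0, 1.0
r_star = golden_section_minimum(lambda r: effective_potential(r, k, l, mu), 0.1, 10.0)
u_min = effective_potential(r_star, k, l, mu)
```

With these values the analytic results are $r_0 = 1$ and $U_{min} = -0.5$.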


Figure 9.4: The attractive inverse-square law potential ($\frac{k}{r}$), the centrifugal potential ($\frac{l^2}{2\mu r^2}$), and the combined effective bound potential.
9.5  Differential  orbit  equation


The differential orbit equation relates the shape of the orbital motion, in plane polar coordinates, to the radial dependence of the two-body central force. A Binet coordinate transformation, which depends on the functional form of $\mathbf{F}(\mathbf{r})$, can simplify the differential orbit equation. For the inverse-square law force, the best Binet transformed variable is $u$, which is defined to be

$$u \equiv \frac{1}{r} \tag{9.35}$$

Inserting the transformed variable $u$ into equation 9.29 gives

$$\dot{\psi} = \frac{l u^2}{\mu} \tag{9.36}$$

From the definition of the new variable

$$\frac{dr}{dt} = -u^{-2}\frac{du}{dt} = -u^{-2}\frac{du}{d\psi}\dot{\psi} = -\frac{l}{\mu}\frac{du}{d\psi} \tag{9.37}$$

Differentiating again gives

$$\frac{d^2r}{dt^2} = -\frac{l}{\mu}\frac{d}{dt}\left(\frac{du}{d\psi}\right) = -\left(\frac{l u}{\mu}\right)^2\frac{d^2u}{d\psi^2} \tag{9.38}$$

Substituting these into Lagrange's radial equation of motion gives

$$\frac{d^2u}{d\psi^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2}F\!\left(\frac{1}{u}\right) \tag{9.39}$$

Binet's differential orbit equation directly relates $\psi$ and $r$, which determines the overall shape of the orbit trajectory. This shape is crucial for understanding the orbital motion of two bodies interacting via a two-body central force. Note that for the special case of an inverse-square law force, that is where $F(\frac{1}{u}) = k u^2$, the right-hand side of equation 9.39 equals a constant $-\frac{\mu k}{l^2}$ since the orbital angular momentum is a conserved quantity.
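
Equation 9.39 is an ordinary differential equation in $\psi$ and can be integrated directly. The following is a minimal sketch, assuming a classical fourth-order Runge-Kutta step and arbitrary units; the function names and the sample values ($\mu = l = 1$, $k = -1$) are illustrative. For the inverse-square force the equation reduces to $u'' + u = 1$, so starting from $u(0) = 1.5$, $u'(0) = 0$ the result should follow $u = 1 + 0.5\cos\psi$.

```python
import math

def binet_rhs(u, force, mu, l):
    """d2u/dpsi2 from equation 9.39; force(r) is the radial force F(r)."""
    return -u - mu * force(1.0 / u) / (l**2 * u**2)

def integrate_orbit(force, mu, l, u0, du0, psi_end, steps=2000):
    """Integrate u(psi) from psi = 0 to psi_end with RK4; returns (u, du/dpsi)."""
    h = psi_end / steps
    u, w = u0, du0
    for _ in range(steps):
        k1u, k1w = w, binet_rhs(u, force, mu, l)
        k2u, k2w = w + 0.5 * h * k1w, binet_rhs(u + 0.5 * h * k1u, force, mu, l)
        k3u, k3w = w + 0.5 * h * k2w, binet_rhs(u + 0.5 * h * k2u, force, mu, l)
        k4u, k4w = w + h * k3w, binet_rhs(u + h * k3u, force, mu, l)
        u += h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        w += h * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
    return u, w

# Inverse-square attraction F(r) = k/r^2 with k = -1, integrated over half a revolution.
u_half, du_half = integrate_orbit(lambda r: -1.0 / r**2, 1.0, 1.0, 1.5, 0.0, math.pi)
```

The same routine accepts any other radial force law, in which case the orbit generally does not close after one revolution.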


9.1  Example:  Central  force  leading  to  a  circular  orbit  $r = 2R\cos\theta$


Binet's differential orbit equation can be used to derive the central potential that leads to the assumed circular trajectory of $r = 2R\cos\theta$, where $R$ is the radius of the circular orbit. Note that this circular orbit passes through the origin of the central force when $r = 2R\cos\theta = 0$.

Figure: Circular trajectory passing through the origin of the central force.

Inserting this trajectory into Binet's differential orbit equation 9.39 gives

$$\frac{1}{2R}\frac{d^2(\cos\theta)^{-1}}{d\theta^2} + \frac{1}{2R}(\cos\theta)^{-1} = -\frac{\mu}{l^2}4R^2(\cos\theta)^2F\!\left(\frac{1}{u}\right) \tag{a}$$

Note that the differential is given by

$$\frac{d^2(\cos\theta)^{-1}}{d\theta^2} = \frac{d}{d\theta}\left(\frac{\sin\theta}{\cos^2\theta}\right) = \frac{2\sin^2\theta}{\cos^3\theta} + \frac{1}{\cos\theta}$$

Inserting this differential into equation (a) gives

$$\frac{2\sin^2\theta}{\cos^3\theta} + \frac{1}{\cos\theta} + \frac{1}{\cos\theta} = \frac{2}{\cos^3\theta} = -\frac{\mu}{l^2}8R^3(\cos\theta)^2F\!\left(\frac{1}{u}\right)$$

Thus the radial dependence of the required central force is

$$F = -\frac{l^2}{8R^3\mu}\frac{2}{\cos^5\theta} = -\frac{8R^2l^2}{\mu}\frac{1}{r^5} = -\frac{k}{r^5}$$

This corresponds to an attractive central force that depends on the inverse fifth power of the radius $r$. Note that this example is unrealistic since the assumed orbit implies that the potential and kinetic energies are infinite when $r \rightarrow 0$ at $\theta \rightarrow \frac{\pi}{2}$.
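
The result of this example can be checked numerically by evaluating both sides of equation 9.39 along the assumed orbit. The sketch below is illustrative only: it approximates $d^2u/d\theta^2$ by central finite differences, and the values of $R$, $\mu$ and $l$ are arbitrary.

```python
import math

def u_of_theta(theta, R):
    """u = 1/r along the assumed orbit r = 2 R cos(theta)."""
    return 1.0 / (2.0 * R * math.cos(theta))

def binet_left_side(theta, R, h=1e-4):
    """d2u/dtheta2 + u, with the second derivative taken by central differences."""
    second = (u_of_theta(theta + h, R) - 2.0 * u_of_theta(theta, R)
              + u_of_theta(theta - h, R)) / h**2
    return second + u_of_theta(theta, R)

def binet_right_side(theta, R, mu, l):
    """-(mu / l^2) F / u^2 with the derived force F = -8 R^2 l^2 / (mu r^5)."""
    r = 2.0 * R * math.cos(theta)
    force = -8.0 * R**2 * l**2 / (mu * r**5)
    u = 1.0 / r
    return -mu * force / (l**2 * u**2)
```

Both sides reduce analytically to $1/(R\cos^3\theta)$, so the two functions should agree to within the finite-difference error at any angle with $\cos\theta > 0$.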
9.6  Hamiltonian


Since the center-of-mass Lagrangian is not an explicit function of time, then

$$\frac{dH_{cm}}{dt} = -\frac{\partial L_{cm}}{\partial t} = 0 \tag{9.40}$$

Thus the center-of-mass Hamiltonian $H_{cm}$ is a constant of motion. However, since the transformation to center of mass can be time dependent, then $H_{cm} \neq E$, that is, it does not include the total energy because the kinetic energy of the center-of-mass motion has been omitted from $H_{cm}$. Also, since no transformation is involved, then

$$H_{cm} = T_{cm} + U = E_{cm} \tag{9.41}$$

That is, the center-of-mass Hamiltonian $H_{cm}$ equals the center-of-mass total energy. The center-of-mass Hamiltonian then can be written using the effective potential (9.33) in the form

$$H_{cm} = \frac{p_r^2}{2\mu} + \frac{p_\psi^2}{2\mu r^2} + U(r) = \frac{p_r^2}{2\mu} + \frac{l^2}{2\mu r^2} + U(r) = \frac{p_r^2}{2\mu} + U_{eff}(r) = E_{cm} \tag{9.42}$$

It is convenient to express the center-of-mass Hamiltonian $H_{cm}$ in terms of the energy equation for the orbit in a central field using the transformed variable $u = \frac{1}{r}$. Substituting equations 9.33 and 9.37 into the Hamiltonian equation 9.42 gives the energy equation of the orbit

$$\frac{l^2}{2\mu}\left[\left(\frac{du}{d\psi}\right)^2 + u^2\right] + U(u^{-1}) = E_{cm} \tag{9.43}$$

Energy conservation allows the Hamiltonian to be used to solve problems directly. That is, since

$$H_{cm} = \frac{\mu\dot{r}^2}{2} + \frac{l^2}{2\mu r^2} + U(r) = E_{cm} \tag{9.44}$$

then

$$\dot{r} = \frac{dr}{dt} = \pm\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)} \tag{9.45}$$

The time dependence can be obtained by integration

$$t = \int\frac{\pm dr}{\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} + \text{constant} \tag{9.46}$$

An inversion of this gives the solution in the standard form $r = r(t)$. However, it is more interesting to find the relation between $r$ and $\psi$. From relation 9.46 for $t$ then

$$dt = \frac{\pm dr}{\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{9.47}$$

while equation 9.29 gives

$$d\psi = \frac{l\,dt}{\mu r^2} = \frac{\pm l\,dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{9.48}$$

Therefore

$$\psi = \int\frac{\pm l\,dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} + \text{constant} \tag{9.49}$$

which can be used to calculate the angular coordinate. This gives the relation between the radial and angular coordinates which specifies the trajectory.

Although  equations  (9.45)  and  (9.49)  formally  give  the  solution,  the  actual  solution  can  be  derived 
analytically  only  for  certain  specific  forms  of  the  force  law  and  these  solutions  differ  for  attractive  versus 
repulsive  interactions. 
9.7  General  features  of  the  orbit  solutions


It is useful to look at the general features of the solutions of the equations of motion given by the equivalent one-body representation of the two-body motion. These orbits depend on the net center-of-mass energy $E_{cm}$. There are five possible situations depending on the center-of-mass total energy $E_{cm}$.

1) $E_{cm} > 0$: The trajectory is hyperbolic and has a minimum distance, but no maximum. The distance of closest approach is given when $\dot{r} = 0$. At the turning point $E_{cm} = U + \frac{l^2}{2\mu r^2}$.

2) $E_{cm} = 0$: It can be shown that the orbit for this case is parabolic.

3) $0 > E_{cm} > U_{min}$: For this case the equivalent orbit has both a maximum and minimum radial distance at which $\dot{r} = 0$. At the turning points the radial kinetic energy term is zero so $E_{cm} = U + \frac{l^2}{2\mu r^2}$. For the attractive inverse-square law force the path is an ellipse with the focus at the center of attraction (Figure 9.5), which is Kepler's First Law. During the time that the radius ranges from $r_{min}$ to $r_{max}$ and back, the radius vector turns through an angle $\Delta\psi$ which is given by

$$\Delta\psi = 2\int_{r_{min}}^{r_{max}}\frac{\pm l\,dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{9.50}$$

The general path prescribes a rosette shape which is a closed curve only if $\Delta\psi$ is a rational fraction of $2\pi$.

4) $E_{cm} = U_{min}$: In this case $r$ is a constant implying that the path is circular since

$$\dot{r} = \frac{dr}{dt} = \pm\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)} = 0 \tag{9.51}$$

5) $E_{cm} < U_{min}$: For this case the square root is imaginary and there is no real solution.
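
Case 3 can be made concrete for the inverse-square potential $U = k/r$. The sketch below rests on assumptions not in the text: arbitrary units, illustrative function names, and a midpoint-rule quadrature. It finds the turning points from $E_{cm} = U + l^2/(2\mu r^2)$, which for this potential is a quadratic in $r$, and then evaluates equation 9.50. The substitution $r = r_{min} + (r_{max} - r_{min})\sin^2 s$ is used here only to remove the integrable endpoint singularities.

```python
import math

def turning_points(E, k, l, mu):
    """Roots of E = k/r + l^2/(2 mu r^2) for a bound orbit (E < 0, k < 0)."""
    disc = math.sqrt(k**2 + 2.0 * E * l**2 / mu)
    return (k + disc) / (2.0 * E), (k - disc) / (2.0 * E)

def apsidal_angle(U, E, l, mu, r_min, r_max, n=4000):
    """Equation 9.50 by the midpoint rule in the variable s, taking the + sign."""
    span = r_max - r_min
    ds = 0.5 * math.pi / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        r = r_min + span * math.sin(s)**2
        radicand = 2.0 * mu * (E - U(r) - l**2 / (2.0 * mu * r**2))
        # dr = span * sin(2s) ds under the substitution above
        total += l * span * math.sin(2.0 * s) / (r**2 * math.sqrt(radicand)) * ds
    return 2.0 * total

E, k, l, mu = -0.3, -1.0, 1.0, 1.0
r_min, r_max = turning_points(E, k, l, mu)
delta_psi = apsidal_angle(lambda r: k / r, E, l, mu, r_min, r_max)
```

For the inverse-square force the computed $\Delta\psi$ should equal $2\pi$, so the orbit closes after one radial cycle. Passing a different potential function, with turning points found by other means, generally gives a $\Delta\psi$ that is not a rational fraction of $2\pi$.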

In  general  the  orbit  is  not  closed,  and  such  open  orbits  do  not  repeat.  Bertrand’s  Theorem  states  that 
the  inverse-square  central  force,  and  the  linear  harmonic  oscillator,  are  the  only  radial  dependences  of  the 
central  force  that  lead  to  stable  closed  orbits. 


9.2  Example:  Orbit  equation  of  motion  for  a free  body 


It is illustrative to use the differential orbit equation 9.39 to show that a body in free motion travels in a straight line. Assume that a line through the origin $O$ intersects perpendicular to the instantaneous trajectory at the point $Q$ which has polar coordinates $(r_0, \delta)$ relative to the origin. The point $P$, with polar coordinates $(r, \phi)$, lies on a straight line through $Q$ that is perpendicular to $OQ$ if, and only if, $r\cos(\phi - \delta) = r_0$. Since the force is zero then the differential orbit equation simplifies to

$$\frac{d^2u(\phi)}{d\phi^2} + u(\phi) = 0$$

A solution of this is

$$u(\phi) = \frac{1}{r_0}\cos(\phi - \delta)$$

where $r_0$ and $\delta$ are arbitrary constants. This can be rewritten as

$$r(\phi) = \frac{r_0}{\cos(\phi - \delta)}$$

This is the equation of a straight line in polar coordinates as illustrated in the adjacent figure. This shows that a free body moves in a straight line if no forces are acting on the body.



9.8  Inverse-square,  two-body,  central  force 


The most important conservative, two-body, central interaction is the attractive inverse-square law force, which is encountered in both gravitational attraction and the Coulomb force. This force $\mathbf{F}(\mathbf{r})$ can be written in the form

$$\mathbf{F}(\mathbf{r}) = \frac{k}{r^2}\hat{\mathbf{r}} \tag{9.52}$$

The force constant $k$ is defined to be negative for an attractive force and positive for a repulsive force. In S.I. units the force constant $k = -Gm_1m_2$ for the gravitational force and $k = +\frac{q_1q_2}{4\pi\varepsilon_0}$ for the Coulomb force. Note that this sign convention is the opposite of what is used in many books which use a negative sign in equation 9.52 and assume $k$ to be positive for an attractive force and negative for a repulsive force.

The  conservative,  inverse-square,  two-body,  central  force  is  unique  in  that  the  underlying  symmetries 
lead  to  four  conservation  laws,  all  of  which  are  of  pivotal  importance  in  nature. 


1.  Conservation of angular momentum: Like all conservative central forces, the inverse-square central two-body force conserves angular momentum as proven in chapter 9.3.

2.  Conservation of energy: This conservative central force can be represented in terms of a scalar potential energy $U(r)$ as given by equation 9.2, where for this central force

$$U(r) = \frac{k}{r} \tag{9.53}$$

Moreover, equation 9.42 showed that the center-of-mass Hamiltonian is conserved, that is, $H_{cm} = E_{cm}$.

3.  Gauss' Law: For a conservative, inverse-square, two-body, central force, the flux of the force field out of any closed surface is proportional to the algebraic sum of the sources and sinks of this field that are located inside the closed surface. The net flux is independent of the distribution of the sources and sinks inside the closed surface, as well as the size and shape of the closed surface. Chapter 2.12.5 proved this for the gravitational force field.

4.  Closed orbits: Two bodies interacting via the conservative, inverse-square, two-body, central force follow closed (degenerate) orbits as stated by Bertrand's Theorem. The first consequence of this symmetry is that Kepler's laws of planetary motion have stable, single-valued orbits. The second consequence of this symmetry is the conservation of the eccentricity vector discussed in chapter 9.8.4.

Observables that depend on Gauss's Law, or on closed planetary orbits, are extremely sensitive to addition of even a minuscule incremental exponent $\xi$ to the radial dependence $r^{-(2\pm\xi)}$ of the force. The statement that the inverse-square, two-body, central force leads to closed orbits can be proven by inserting equation 9.52 into the orbit differential equation,

$$\frac{d^2u}{d\psi^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2}ku^2 = -\frac{\mu k}{l^2} \tag{9.54}$$

Using the transformation

$$y \equiv u + \frac{\mu k}{l^2} \tag{9.55}$$

the orbit equation becomes

$$\frac{d^2y}{d\psi^2} + y = 0 \tag{9.56}$$

A solution of this equation is

$$y = B\cos(\psi - \psi_0) \tag{9.57}$$

Therefore

$$u = \frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos(\psi - \psi_0)\right] \tag{9.58}$$

This is the equation of a conic section. For an attractive, inverse-square, central force, equation 9.58 is the equation for an ellipse with the origin of $r$ at one of the foci of the ellipse that has eccentricity $\epsilon$, defined as

$$\epsilon \equiv -\frac{Bl^2}{\mu k} \tag{9.59}$$
Equation 9.58 is the polar equation of a conic section. Equation 9.58 also can be derived with the origin at a focus by inserting the inverse-square law potential into equation 9.49 which gives

$$\psi = \int\frac{\pm du}{\sqrt{\frac{2\mu E_{cm}}{l^2} - \frac{2\mu k}{l^2}u - u^2}} + \text{constant} \tag{9.60}$$

The solution of this gives

$$u = \frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \sqrt{1 + \frac{2E_{cm}l^2}{\mu k^2}}\cos(\psi - \psi_0)\right] \tag{9.61}$$

Equations 9.58 and 9.61 are identical if the eccentricity $\epsilon$ equals

$$\epsilon = \sqrt{1 + \frac{2E_{cm}l^2}{\mu k^2}} \tag{9.62}$$

The value of $\psi_0$ merely determines the orientation of the major axis of the equivalent orbit. Without loss of generality, it is possible to assume that the angle $\psi$ is measured with respect to the major axis of the orbit, that is $\psi_0 = 0$. Then the equation can be written as

$$u = \frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos\psi\right] = -\frac{\mu k}{l^2}\left[1 + \sqrt{1 + \frac{2E_{cm}l^2}{\mu k^2}}\cos\psi\right] \tag{9.63}$$

This is the equation of a conic section where $\epsilon$ is the eccentricity of the conic section. The conic section is a hyperbola if $\epsilon > 1$, parabola if $\epsilon = 1$, ellipse if $\epsilon < 1$, and a circle if $\epsilon = 0$. All the equivalent one-body orbits for an attractive force have the origin of the force at a focus of the conic section. The orbits depend on whether the force is attractive or repulsive, on the conserved angular momentum $l$, and on the center-of-mass energy $E_{cm}$.
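
The dependence of the orbit type on $E_{cm}$ and $l$ can be sketched in a few lines. The functions below are illustrative helpers, not part of the text: they evaluate equation 9.62, classify the conic section, and evaluate $r(\psi)$ from equation 9.63. The tolerance used to recognise $\epsilon = 0$ and $\epsilon = 1$ is an arbitrary choice.

```python
import math

def eccentricity(E_cm, l, mu, k):
    """Equation 9.62; the max() guards against tiny negative rounding errors."""
    return math.sqrt(max(0.0, 1.0 + 2.0 * E_cm * l**2 / (mu * k**2)))

def classify_conic(eps, tol=1e-12):
    """Name the conic section for a given eccentricity."""
    if eps < tol:
        return "circle"
    if eps < 1.0 - tol:
        return "ellipse"
    if eps <= 1.0 + tol:
        return "parabola"
    return "hyperbola"

def orbit_radius(psi, l, mu, k, eps):
    """r(psi) from equation 9.63; physically meaningful only where the result is positive."""
    return -l**2 / (mu * k * (1.0 + eps * math.cos(psi)))
```

For example, with $\mu = l = 1$ and $k = -1$, the energies $E_{cm} = -0.5$, $-0.3$, $0$ and $1.5$ fall into the circle, ellipse, parabola and hyperbola cases respectively.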


9.8.1  Bound  orbits 


Closed bound orbits occur only if the following requirements are satisfied.

1.  The force must be attractive ($k < 0$); then equation 9.63 ensures that $r$ is positive.

2.  For a closed elliptical orbit, the eccentricity $\epsilon < 1$ of the equivalent one-body representation of the orbit implies that the total center-of-mass energy $E_{cm} < 0$, that is, the closed orbit is bound.

Bound elliptical orbits have the center-of-force at one interior focus $F_1$ of the elliptical one-body representation of the orbit as shown in figure 9.5.

The minimum value of the orbit $r = r_{min}$ occurs when $\psi = 0$, where

$$r_{min} = -\frac{l^2}{\mu k\left[1 + \epsilon\right]} \tag{9.64}$$

This minimum distance is called the periapsis¹.

Figure 9.5: Bound elliptical orbit.

¹ The Greek term apsis refers to the points of greatest or least distance of approach for an orbiting body from one of the foci of the elliptical orbit. The terms periapsis and pericenter both are used to designate the closest distance of approach, while apoapsis and apocenter are used to designate the farthest distance of approach. Attaching the terms "peri-" and "apo-" to the general term "-apsis" is preferred over having different names for each object in the solar system. For example, frequently used terms are "-helion" for orbits of the sun, "-gee" for orbits around the earth, and "-cynthion" for orbits around the moon.

The maximum distance, $r = r_{max}$, which is called the apoapsis, occurs when $\psi = 180°$

$$r_{max} = -\frac{l^2}{\mu k\left[1 - \epsilon\right]} \tag{9.65}$$

Remember that since $k < 0$ for bound orbits, the negative signs in equations 9.64 and 9.65 lead to $r > 0$. The most bound orbit is a circle having $\epsilon = 0$ which implies that $E_{cm} = -\frac{\mu k^2}{2l^2}$.

The shape of the elliptical orbit also can be described with respect to the center of the elliptical equivalent orbit by deriving the lengths of the semi-major axis $a$ and the semi-minor axis $b$ shown in figure 9.5.

$$a = \frac{1}{2}\left(r_{min} + r_{max}\right) = -\frac{1}{2}\left(\frac{l^2}{\mu k\left[1 + \epsilon\right]} + \frac{l^2}{\mu k\left[1 - \epsilon\right]}\right) = -\frac{l^2}{\mu k\left[1 - \epsilon^2\right]} \tag{9.66}$$

$$b = a\sqrt{1 - \epsilon^2} = -\frac{l^2}{\mu k\sqrt{1 - \epsilon^2}} \tag{9.67}$$

Remember that the predicted bound elliptical orbit corresponds to the equivalent one-body representation for the two-body motion as illustrated in figure 9.2. This can be transformed to the individual spatial trajectories of each of the two bodies in an inertial frame.
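
The geometric quantities in equations 9.64 to 9.67 follow directly from $E_{cm}$, $l$, $\mu$ and $k$. The helper below is a hypothetical sketch in arbitrary units: it collects them and provides two consistency checks, $b^2 = r_{min}r_{max}$ and $a = k/(2E_{cm})$, the latter obtained by substituting equation 9.62 into equation 9.66.

```python
import math

def bound_orbit_geometry(E_cm, l, mu, k):
    """Periapsis, apoapsis and semi-axes of a bound orbit (requires k < 0 and E_cm < 0)."""
    eps = math.sqrt(1.0 + 2.0 * E_cm * l**2 / (mu * k**2))   # equation 9.62
    scale = -l**2 / (mu * k)                                 # positive for k < 0
    r_min = scale / (1.0 + eps)                              # equation 9.64
    r_max = scale / (1.0 - eps)                              # equation 9.65
    a = 0.5 * (r_min + r_max)                                # equation 9.66
    b = a * math.sqrt(1.0 - eps**2)                          # equation 9.67
    return {"eps": eps, "r_min": r_min, "r_max": r_max, "a": a, "b": b}

# Illustrative values only.
orbit = bound_orbit_geometry(E_cm=-0.8, l=1.5, mu=2.0, k=-3.0)
```

With these sample values the semi-major axis should come out as $k/(2E_{cm}) = 1.875$.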


9.8.2  Kepler’s  laws  for  bound  planetary  motion 

Kepler's three laws of motion apply to the motion of two bodies in a bound orbit due to the attractive gravitational force for which $k = -Gm_1m_2$.

1) Each planet moves in an elliptical orbit with the sun at one focus.

2) The radius vector, drawn from the sun to a planet, describes equal areas in equal times.

3) The square of the period of revolution about the sun is proportional to the cube of the major axis of the orbit.

Two bodies interacting via the gravitational force, which is a conservative, inverse-square, two-body central force, are best handled using the equivalent orbit representation. The first and second laws were proved in chapters 9.8 and 9.3. That is, the second law is equivalent to the statement that the angular momentum is conserved. The third law can be derived using the fact that the area of an ellipse is

$$A = \pi ab = \pi a^2\sqrt{1 - \epsilon^2} = \frac{\pi l}{\sqrt{-\mu k}}a^{\frac{3}{2}} \tag{9.68}$$

Equations 9.26 and 9.27 give that the rate of change of area swept out by the radius vector is

$$\frac{dA}{dt} = \frac{1}{2}r^2\dot{\psi} = \frac{l}{2\mu} \tag{9.69}$$

Therefore the period for one revolution $\tau$ is given by the time to sweep out one complete ellipse

$$\tau = \frac{A}{\left(\frac{dA}{dt}\right)} = 2\pi\sqrt{\frac{\mu}{-k}}\,a^{\frac{3}{2}} \tag{9.70}$$

This leads to Kepler's 3rd law

$$\tau^2 = -4\pi^2\frac{\mu}{k}a^3 \tag{9.71}$$

Bound orbits occur only for attractive forces for which the force constant $k$ is negative, which cancels the negative sign in equation 9.71. For example, for the gravitational force $k = -Gm_1m_2$.
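
Kepler's third law can also be checked against a direct numerical evaluation of the time integral in equation 9.46. The sketch below makes several assumptions not in the text: the inverse-square potential $U = k/r$, arbitrary units and sample values, and a midpoint rule with the substitution $r = r_{min} + (r_{max} - r_{min})\sin^2 s$, used only to tame the endpoint singularities. The period is taken as twice the time from $r_{min}$ to $r_{max}$.

```python
import math

def turning_points(E, k, l, mu):
    """Roots of E = k/r + l^2/(2 mu r^2) for a bound orbit (E < 0, k < 0)."""
    disc = math.sqrt(k**2 + 2.0 * E * l**2 / mu)
    return (k + disc) / (2.0 * E), (k - disc) / (2.0 * E)

def radial_period(E, k, l, mu, n=4000):
    """Twice the integral of dr / sqrt((2/mu)(E - k/r - l^2/(2 mu r^2))) between turning points."""
    r_min, r_max = turning_points(E, k, l, mu)
    span = r_max - r_min
    ds = 0.5 * math.pi / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        r = r_min + span * math.sin(s)**2
        radicand = (2.0 / mu) * (E - k / r - l**2 / (2.0 * mu * r**2))
        total += span * math.sin(2.0 * s) / math.sqrt(radicand) * ds
    return 2.0 * total

def kepler_period(a, k, mu):
    """Period from Kepler's third law, tau = 2 pi sqrt(mu / (-k)) a^(3/2)."""
    return 2.0 * math.pi * math.sqrt(mu / (-k)) * a**1.5

E, k, l, mu = -0.8, -3.0, 1.5, 2.0
r_min, r_max = turning_points(E, k, l, mu)
tau_numeric = radial_period(E, k, l, mu)
tau_kepler = kepler_period(0.5 * (r_min + r_max), k, mu)
```

The two periods should agree closely. Note that the result depends on $a$, $\mu$ and $k$ only, not separately on $l$.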

Note that the reduced mass $\mu = \frac{m_1m_2}{m_1 + m_2}$ occurs in Kepler's 3rd law. That is, Kepler's third law can be written in terms of the actual masses of the bodies to be

$$\tau^2 = \frac{4\pi^2}{G\left(m_1 + m_2\right)}a^3 \tag{9.72}$$

In relating the relative periods of the different planets Kepler made the approximation that the mass of the planet $m_1$ is negligible relative to the mass of the sun $m_2$.
The eccentricity of the major planets ranges from $\epsilon = 0.2056$ for Mercury, to $\epsilon = 0.0068$ for Venus. The Earth has an eccentricity of $\epsilon = 0.0167$ with $r_{min} = 91\times10^6$ miles and $r_{max} = 95\times10^6$ miles. On the other hand, $\epsilon = 0.967$ for Halley's comet, that is, the radius vector ranges from about 0.6 to about 35 times the radius of the orbit of the Earth (the semi-major axis is about 18 times that radius).

The orbit energy can be derived by substituting the eccentricity, given by equation 9.62, into the semi-major axis length $a$, given by equation 9.66, which leads to the center-of-mass energy of

$$E_{cm} = \frac{k}{2a} \tag{9.73}$$

However, the Hamiltonian, given by equation 9.42, implies that $E_{cm}$ is

$$E_{cm} = \frac{1}{2}\mu v^2 + \frac{k}{r} = \frac{k}{2a} \tag{9.74}$$

For the simple case of a circular orbit, $a = r$, then the velocity $v$ equals

$$v = \sqrt{\frac{-k}{\mu r}} \tag{9.75}$$

For a circular orbit, the drag on a satellite lowers the total energy resulting in a decrease in the radius of the orbit and a concomitant increase in velocity. That is, when the orbit radius is decreased, part of the potential energy released accounts for the work done against the drag, and the remaining part goes towards the increase of the kinetic energy. Also note that, as predicted by the Virial Theorem, the kinetic energy is half the magnitude of the potential energy for the inverse-square law force, that is $T = -\frac{1}{2}U$ for the circular orbit, and on time average for any bound orbit.
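
The effect of drag on a circular orbit can be illustrated with equations 9.74 and 9.75. The short sketch below is illustrative only, in arbitrary units with $k < 0$: it evaluates the speed and the energies for two radii, so that the lower orbit can be seen to have the higher speed and the lower total energy.

```python
import math

def circular_orbit(r, k, mu):
    """Speed, kinetic, potential and total energy of a circular orbit of radius r (k < 0)."""
    v = math.sqrt(-k / (mu * r))      # equation 9.75
    kinetic = 0.5 * mu * v**2
    potential = k / r
    return v, kinetic, potential, kinetic + potential

high = circular_orbit(2.0, -3.0, 1.5)
low = circular_orbit(1.0, -3.0, 1.5)
```

Comparing `high` and `low`: the speed rises as the radius shrinks, the total energy equals $k/(2r)$ in both cases, and the kinetic energy equals $-\frac{1}{2}$ times the potential energy.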


9.8.3  Unbound  orbits 


Attractive inverse-square central forces lead to hyperbolic orbits for $\epsilon > 1$ for which $E_{cm} > 0$, that is, the orbit is unbound. In addition, the orbits always are unbound for a repulsive force since $U = \frac{k}{r}$ is positive as is the kinetic energy $T_{cm}$, thus $E_{cm} = T_{cm} + U_{cm} > 0$. The radial orbit equation for either an attractive or a repulsive force is

$$r = -\frac{l^2}{\mu k\left[1 + \epsilon\cos\psi\right]} \tag{9.76}$$

For a repulsive force $k$ is positive and $l^2$ always is positive. Therefore to ensure that $r$ remains positive the bracket term must be negative. That is

$$\left[1 + \epsilon\cos\psi\right] < 0 \qquad k > 0 \tag{9.77}$$

For an attractive force $k$ is negative and since $l^2$ is positive then the bracket term must be positive to ensure that $r$ is positive. That is,

$$\left[1 + \epsilon\cos\psi\right] > 0 \qquad k < 0 \tag{9.78}$$

Figure 9.6 shows both branches of the hyperbola for a given angle $\psi$ for the equivalent two-body orbits where the center of force is at the origin. For an attractive force, $k < 0$, the center of force is at the interior focus of the hyperbola, whereas for a repulsive force the center of force is at the exterior focus. For a given value of $|\psi|$ the asymptotes of the orbits both are displaced by the same impact parameter $b$ from parallel lines passing through the center of force. The scattering angle, between the outgoing direction of the scattered body and the incident direction, is designated to be $\theta$, which is related to the angle $\psi$ by $\theta = 180° - 2\psi$.
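
Conditions 9.77 and 9.78 select which range of $\psi$ belongs to each branch of the hyperbola. The sketch below uses illustrative names and arbitrary units. It evaluates equation 9.76 only where the bracket has the sign that makes $r$ positive, and it returns the polar angle at which the bracket vanishes, where $r \rightarrow \infty$ along an asymptote.

```python
import math

def unbound_radius(psi, l, mu, k, eps):
    """Equation 9.76, or None where the sign of the bracket would make r negative."""
    bracket = 1.0 + eps * math.cos(psi)
    if bracket == 0.0 or (k > 0) == (bracket > 0):
        return None               # violates 9.77 (k > 0) or 9.78 (k < 0)
    return -l**2 / (mu * k * bracket)

def asymptote_angle(eps):
    """Polar angle psi at which 1 + eps*cos(psi) = 0, valid for eps > 1."""
    return math.acos(-1.0 / eps)
```

For $\epsilon = 2$ the bracket vanishes at $\psi = 120°$. The attractive branch occupies $|\psi| < 120°$ with closest approach at $\psi = 0$, while the repulsive branch occupies $120° < \psi < 240°$ with closest approach at $\psi = 180°$.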


Figure 9.6: Hyperbolic two-body orbits for a repulsive (left) and attractive (right) inverse-square, central two-body force. Both orbits have the angular momentum vector pointing upwards out of the plane of the orbit.
9.8.4  Eccentricity  vector

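Two bodies interacting via a conservative two-body central force have two invariant first-order integrals, namely the conservation of energy and the conservation of angular momentum. For the special case of the inverse-square law, there is a third invariant of the motion, which Hamilton called the eccentricity vector², that unambiguously defines the orientation and direction of the major axis of the elliptical orbit. It will be shown that the angular momentum plus the eccentricity vector completely define the plane and orientation of the orbit for a conservative inverse-square law central force.

Newton's second law for a central force can be written in the form

$$\dot{\mathbf{p}} = f(r)\hat{\mathbf{r}} \tag{9.79}$$

Note that the angular momentum $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ is conserved for a central force, that is $\dot{\mathbf{L}} = 0$. Therefore the time derivative of the product $\mathbf{p}\times\mathbf{L}$ reduces to

$$\frac{d}{dt}\left(\mathbf{p}\times\mathbf{L}\right) = \dot{\mathbf{p}}\times\mathbf{L} = f(r)\hat{\mathbf{r}}\times\left(\mathbf{r}\times\mu\dot{\mathbf{r}}\right) = f(r)\frac{\mu}{r}\left[\mathbf{r}\left(\mathbf{r}\cdot\dot{\mathbf{r}}\right) - r^2\dot{\mathbf{r}}\right] \tag{9.80}$$

This can be simplified using the fact that

$$\mathbf{r}\cdot\dot{\mathbf{r}} = \frac{1}{2}\frac{d}{dt}\left(\mathbf{r}\cdot\mathbf{r}\right) = r\dot{r} \tag{9.81}$$

thus

$$f(r)\frac{\mu}{r}\left[\mathbf{r}\left(\mathbf{r}\cdot\dot{\mathbf{r}}\right) - r^2\dot{\mathbf{r}}\right] = -\mu f(r)r^2\left[\frac{\dot{\mathbf{r}}}{r} - \frac{\mathbf{r}\dot{r}}{r^2}\right] = -\mu f(r)r^2\frac{d}{dt}\left(\frac{\mathbf{r}}{r}\right) \tag{9.82}$$

This allows equation 9.80 to be reduced to

$$\frac{d}{dt}\left(\mathbf{p}\times\mathbf{L}\right) = -\mu f(r)r^2\frac{d\hat{\mathbf{r}}}{dt} \tag{9.83}$$

Assume the special case of the inverse-square law, equation 9.52, then the central force equation 9.83 reduces to

$$\frac{d}{dt}\left(\mathbf{p}\times\mathbf{L}\right) = -\mu k\frac{d\hat{\mathbf{r}}}{dt} \tag{9.84}$$

or

$$\frac{d}{dt}\left[\left(\mathbf{p}\times\mathbf{L}\right) + \left(\mu k\hat{\mathbf{r}}\right)\right] = 0 \tag{9.85}$$

Define the eccentricity vector $\mathbf{A}$ as

$$\mathbf{A} \equiv \left(\mathbf{p}\times\mathbf{L}\right) + \left(\mu k\hat{\mathbf{r}}\right) \tag{9.86}$$

then equation 9.85 corresponds to

$$\frac{d\mathbf{A}}{dt} = 0 \tag{9.87}$$

This is a statement that the eccentricity vector $\mathbf{A}$ is a constant of motion for an inverse-square, central force.

The definition of the eccentricity vector $\mathbf{A}$ and angular momentum vector $\mathbf{L}$ implies a zero scalar product,

$$\mathbf{A}\cdot\mathbf{L} = 0 \tag{9.88}$$

Thus the eccentricity vector $\mathbf{A}$ and angular momentum $\mathbf{L}$ are mutually perpendicular, that is, $\mathbf{A}$ is in the plane of the orbit while $\mathbf{L}$ is perpendicular to the plane of the orbit. The eccentricity vector $\mathbf{A}$ always points along the major axis of the ellipse from the focus to the periapsis as illustrated on the left side in figure 9.7.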

² The symmetry underlying the eccentricity vector is less intuitive than the energy or angular momentum invariants, leading to it being discovered independently several times during the past three centuries. Jakob Hermann was the first to identify this invariant for the special case of the inverse-square central force. Bernoulli generalized his proof in 1710. Laplace derived the invariant at the end of the 18th century using analytical mechanics. Hamilton derived the connection between the invariant and the orbit eccentricity. Gibbs derived the invariant using vector analysis. Runge published Gibbs' derivation in his textbook, which was referenced by Lenz in a 1924 paper on the quantal model of the hydrogen atom. Goldstein named this invariant the "Laplace-Runge-Lenz vector", while others have named it the "Runge-Lenz vector" or the "Lenz vector". This book uses Hamilton's more intuitive name of "eccentricity vector".
Figure 9.7: The elliptical trajectory and eccentricity vector $\mathbf{A}$ for two bodies interacting via the inverse-square, central force for eccentricity $\epsilon = 0.75$. The left plot shows the elliptical spatial trajectory where the semi-major axis is assumed to be on the $x$-axis and the angular momentum $\mathbf{L} = l\hat{\mathbf{z}}$ is out of the page. The force center is at one focus of the ellipse. The vector coupling relation $\mathbf{A} \equiv (\mathbf{p}\times\mathbf{L}) + (\mu k\hat{\mathbf{r}})$ is illustrated at four points on the spatial trajectory. The right plot is a hodograph of the linear momentum $\mathbf{p}$ for this trajectory. The periapsis is denoted by the number 1 and the apoapsis is marked as 3 on both plots. Note that the eccentricity vector $\mathbf{A}$ is a constant that points parallel to the major axis towards the periapsis.


As a consequence, the two orthogonal vectors $\mathbf{A}$ and $\mathbf{L}$ completely define the plane of the orbit, plus
the orientation of the major axis of the Kepler orbit in this plane. The three vectors $\mathbf{A}$,
$\mathbf{p}\times\mathbf{L}$, and $(\mu k\hat{\mathbf{r}})$ obey the triangle rule as illustrated in the left side of
figure 9.7.

Hamilton  noted  the  direct  connection  between  the  eccentricity  vector  A and  the  eccentricity  e of  the 
conic  section  orbit.  This  can  be  shown  by  considering  the  scalar  product 

$$\mathbf{A}\cdot\mathbf{r} = Ar\cos\psi = \mathbf{r}\cdot(\mathbf{p}\times\mathbf{L}) + \mu k r \tag{9.89}$$


Note  that  the  triple  scalar  product  can  be  permuted  to  give 


$$\mathbf{r}\cdot(\mathbf{p}\times\mathbf{L}) = (\mathbf{r}\times\mathbf{p})\cdot\mathbf{L} = \mathbf{L}\cdot\mathbf{L} = l^2 \tag{9.90}$$


Inserting equation 9.90 into 9.89 gives

$$\frac{1}{r} = -\frac{\mu k}{l^2}\left(1 - \frac{A}{\mu k}\cos\psi\right) \tag{9.91}$$

Note that equations 9.63 and 9.91 are identical if $\psi_0 = 0$. This implies that the eccentricity $\epsilon$ and $A$
are related by

$$\epsilon = -\frac{A}{\mu k} \tag{9.92}$$

where $k$ is defined to be negative for an attractive force. The relation between the eccentricity and total
center-of-mass energy can be used to rewrite equation 9.62 in the form

$$A^2 = \mu^2 k^2 + 2\mu E_{cm} l^2 \tag{9.93}$$

The combination of the eccentricity vector $\mathbf{A}$ and the angular momentum vector $\mathbf{L}$ completely
specifies the orbit for an inverse square-law central force. The trajectory is in the plane perpendicular to the angular
momentum vector $\mathbf{L}$, while the eccentricity, plus the orientation of the orbit, both are defined by the
eccentricity vector $\mathbf{A}$. The eccentricity vector and angular momentum vector each have three independent
coordinates, that is, these two vector invariants provide six constraints, while the scalar invariant energy $E$ adds
one additional constraint. The exact location of the particle moving along the trajectory is not defined and thus there
are only five independent coordinates governed by the above seven constraints. Thus the eccentricity vector, angular
momentum, and center-of-mass energy are related by the two equations 9.88 and 9.93.
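
The relations among $\mathbf{A}$, $\mathbf{L}$ and $E$ can be checked numerically for any single state of the system. The following is a minimal sketch in plain Python, assuming arbitrary units, the force convention $\mathbf{F} = (k/r^2)\hat{\mathbf{r}}$ with $k < 0$ for attraction, and a hypothetical bound initial state; the helper names are illustrative and do not come from the text.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x*y for x, y in zip(a, b))

def kepler_invariants(r, p, mu, k):
    """Return (L, A, E) for F = (k/r^2) r_hat, with k < 0 for attraction."""
    rmag = math.sqrt(dot(r, r))
    L = cross(r, p)                                  # angular momentum
    pxL = cross(p, L)
    A = tuple(pxL[i] + mu*k*r[i]/rmag for i in range(3))   # eccentricity vector
    E = dot(p, p)/(2*mu) + k/rmag                    # U(r) = k/r in this convention
    return L, A, E

# Hypothetical bound state, arbitrary units (assumed values)
mu, k = 1.0, -1.0
r = (1.0, 0.2, 0.0)
p = (0.1, 0.9, 0.0)

L, A, E = kepler_invariants(r, p, mu, k)
l2, A2 = dot(L, L), dot(A, A)
a_dot_l = dot(A, L)                                  # orthogonality of A and L
residual = A2 - (mu*mu*k*k + 2*mu*E*l2)              # A^2 = mu^2 k^2 + 2 mu E l^2
eps_from_A = math.sqrt(A2)/(mu*abs(k))               # eccentricity from |A|
eps_from_E = math.sqrt(1 + 2*E*l2/(mu*k*k))          # eccentricity from E and l
print(a_dot_l, residual, eps_from_A, eps_from_E)
```

Both routes to the eccentricity should agree to rounding error, which illustrates that $A$, $l$ and $E$ are not independent.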

Noether’s  theorem  states  that  each  conservation  law  is  a manifestation  of  an  underlying  symmetry. 
Identification  of  the  underlying  symmetry  responsible  for  the  conservation  of  the  eccentricity  vector  A is 
elucidated  using  equation  9.86  to  give 


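$$(\mu k\hat{\mathbf{r}}) = \mathbf{A} - (\mathbf{p}\times\mathbf{L}) \tag{9.94}$$

Take the scalar product of each side with itself, noting that $|\mathbf{p}\times\mathbf{L}| = pl$ since $\mathbf{p}$ is
perpendicular to $\mathbf{L}$,

$$(\mu k\hat{\mathbf{r}})\cdot(\mu k\hat{\mathbf{r}}) = (\mu k)^2 = p^2 l^2 + A^2 - 2\mathbf{L}\cdot(\mathbf{A}\times\mathbf{p}) \tag{9.95}$$

Choose the angular momentum to be along the z-axis, that is, $\mathbf{L} = l\hat{\mathbf{z}}$, and, since $\mathbf{p}$
and $\mathbf{A}$ are perpendicular to $\mathbf{L}$, then $\mathbf{p}$ and $\mathbf{A}$ are in the x-y plane. Assume that
the semimajor axis of the elliptical orbit is along the x-axis, then the locus of the momentum vector on a momentum
hodograph has the equation

$$p_x^2 + \left(p_y - \frac{A}{l}\right)^2 = \left(\frac{\mu k}{l}\right)^2 \tag{9.96}$$

Equation 9.96 implies that the locus of the momentum vector is a circle of radius $\left|\mu k/l\right|$ with the center
displaced from the origin at coordinates $(0, A/l)$, as shown by the momentum hodograph on the right side of figure 9.7.
The angle $\beta$ and eccentricity $\epsilon$ are related by

$$\cos\beta = \frac{A/l}{\mu|k|/l} = \frac{A}{\mu|k|} = \epsilon \tag{9.97}$$

For a circular orbit, $\epsilon = \frac{A}{\mu|k|} = 0$, the hodograph circle is centered at the origin, and thus the
magnitude $|\mathbf{p}|$ is a constant around the whole trajectory.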

The inverse-square, central, two-body force is unusual in that it leads to stable closed bound orbits because the radial
and angular frequencies are degenerate, i.e. $\omega_r = \omega_\psi$. In momentum space, the locus of the linear
momentum vector $\mathbf{p}$ is a perfect circle, which is the underlying symmetry responsible for both the fact that the
orbits are closed and the invariance of the eccentricity vector. Mathematically this symmetry for the Kepler problem
corresponds to the body moving freely on the boundary of a four-dimensional sphere in space and momentum. The invariance
of the eccentricity vector is a manifestation of the special property of the inverse-square, central force under certain
rotations in this four-dimensional space; this O(4) symmetry is an example of a hidden symmetry.
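
The circular momentum hodograph can be illustrated numerically. The sketch below integrates a bound Kepler orbit with a classical fourth-order Runge-Kutta step and records how far the momentum strays from a circle of radius $\mu|k|/l$ centred at $(0, A/l)$. The units, step size, and the choice $l = 1$ with periapsis on the +x axis are assumptions made for illustration only.

```python
import math

def accel(x, y, mu, k):
    """Acceleration for F = (k/r^2) r_hat in the orbital plane (k < 0 attractive)."""
    r3 = (x*x + y*y)**1.5
    return k*x/(mu*r3), k*y/(mu*r3)

def rk4_step(state, dt, mu, k):
    """One classical Runge-Kutta step for state = (x, y, vx, vy)."""
    def deriv(s):
        ax, ay = accel(s[0], s[1], mu, k)
        return (s[2], s[3], ax, ay)
    k1 = deriv(state)
    k2 = deriv(tuple(s + 0.5*dt*d for s, d in zip(state, k1)))
    k3 = deriv(tuple(s + 0.5*dt*d for s, d in zip(state, k2)))
    k4 = deriv(tuple(s + dt*d for s, d in zip(state, k3)))
    return tuple(s + dt*(a + 2*b + 2*c + d)/6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

mu, k, eps = 1.0, -1.0, 0.75          # assumed units; eps as in figure 9.7
l = 1.0                               # assumed angular momentum magnitude
r_peri = (l*l/(mu*abs(k)))/(1 + eps)  # periapsis distance on the +x axis
state = (r_peri, 0.0, 0.0, l/(mu*r_peri))

A = mu*abs(k)*eps                     # magnitude of the eccentricity vector
radius = mu*abs(k)/l                  # hodograph radius
max_dev = 0.0
for _ in range(20000):                # covers most of one orbital period
    state = rk4_step(state, 1e-3, mu, k)
    px, py = mu*state[2], mu*state[3]
    dev = abs(math.hypot(px, py - A/l) - radius)
    max_dev = max(max_dev, dev)
print(max_dev)
```

The deviation stays at the level of the integration error, consistent with the momentum tracing a circle even though the spatial orbit is an ellipse.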


9.9  Isotropic,  linear,  two-body,  central  force 

Closed orbits occur for the two-dimensional linear oscillator when $\omega_x/\omega_y$ is a rational fraction as
discussed in chapter 3.3. Bertrand's Theorem states that the linear oscillator, and the inverse-square law (Kepler
problem), are the only two-body central forces that have single-valued, stable, closed orbits of the coupled radial and
angular motion. The invariance of the eccentricity vector was the underlying symmetry leading to single-valued, stable,
closed orbits for the Kepler problem. It is interesting to explore the symmetry that leads to stable closed orbits for
the harmonic oscillator. For simplicity, this discussion is restricted to the isotropic, harmonic, two-body, central
force where $\omega_x = \omega_y = \omega$, for which the two-body, central force is linear

$$\mathbf{F}(\mathbf{r}) = k\mathbf{r} \tag{9.98}$$

where $k > 0$ corresponds to a repulsive force and $k < 0$ to an attractive force. This isotropic harmonic force can be
expressed in terms of a spherical potential $U(r)$ where

$$U(r) = -\frac{1}{2}kr^2 \tag{9.99}$$

Since this is a central two-body force, both the equivalent one-body representation, and the conservation of angular
momentum, are equally applicable to the harmonic two-body force. As discussed in section 9.3, since the two-body force is
central, the motion is confined to a plane, and thus the Lagrangian can be expressed in polar coordinates. In addition,
since the force is spherically symmetric, then the angular momentum is conserved. The orbit solutions are conic sections
as described in chapter 9.7. The shape of the orbit for the harmonic two-body central force can be derived using either
polar or cartesian coordinates as illustrated below.

9.9.1  Polar  coordinates 

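The origin of the equivalent orbit for the harmonic force will be found to be at the center of an ellipse, rather than at
a focus of the ellipse as found for the inverse square law. The shape of the orbit can be defined using a Binet
differential orbit equation that employs the transformation

$$u' \equiv \frac{1}{r^2} \tag{9.100}$$

Then

$$\frac{du'}{d\psi} = -\frac{2}{r^3}\frac{dr}{d\psi} \tag{9.101}$$

The chain rule gives that

$$\dot{r} = \frac{dr}{d\psi}\dot{\psi} = -\frac{r^3}{2}\frac{du'}{d\psi}\dot{\psi} = -\frac{r}{2}\frac{p_\psi}{\mu}\frac{du'}{d\psi} \tag{9.102}$$

where $\dot{\psi} = p_\psi/(\mu r^2)$ has been used. Substituting this into the Hamiltonian $H_{cm}$, equation 9.42, gives

$$\frac{\mu}{2}\dot{r}^2 = \frac{p_\psi^2}{8u'\mu}\left(\frac{du'}{d\psi}\right)^2 = E - \frac{p_\psi^2}{2\mu}u' + \frac{k}{2u'} \tag{9.103}$$

Rearranging this equation gives

$$\left(\frac{du'}{d\psi}\right)^2 + 4u'^2 - \frac{8E\mu}{p_\psi^2}u' = \frac{4k\mu}{p_\psi^2} \tag{9.104}$$

Addition of a constant to both sides of the equation completes the square

$$\left[\frac{d}{d\psi}\left(u' - \frac{E\mu}{p_\psi^2}\right)\right]^2 + 4\left(u' - \frac{E\mu}{p_\psi^2}\right)^2 = \frac{4k\mu}{p_\psi^2} + 4\left(\frac{E\mu}{p_\psi^2}\right)^2 \tag{9.105}$$

The right-hand side of equation 9.105 is a constant. The solution of 9.105 must be a sine or cosine function of the
polar angle $\psi$. That is

$$u' - \frac{E\mu}{p_\psi^2} = \left[\left(\frac{E\mu}{p_\psi^2}\right)^2 + \frac{k\mu}{p_\psi^2}\right]^{1/2}\cos 2(\psi - \psi_0) \tag{9.106}$$

That is,

$$u' = \frac{1}{r^2} = \frac{E\mu}{p_\psi^2}\left(1 + \left[1 + \frac{kp_\psi^2}{E^2\mu}\right]^{1/2}\cos 2(\psi - \psi_0)\right) \tag{9.107}$$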


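Equation 9.107 corresponds to a closed elliptical orbit centered at the origin, as illustrated in figure 9.8. Comparing
9.107 with the equation of an ellipse centered at the origin,
$\frac{1}{r^2} = \frac{1}{2}\left(\frac{1}{a^2} + \frac{1}{b^2}\right) + \frac{1}{2}\left(\frac{1}{a^2} - \frac{1}{b^2}\right)\cos 2(\psi - \psi_0)$,
shows that the eccentricity $\epsilon$ of this closed orbit is given by

$$\frac{\epsilon^2}{2 - \epsilon^2} = \left[1 + \frac{kp_\psi^2}{E^2\mu}\right]^{1/2} \tag{9.108}$$

Equations 9.66, 9.67 give that the eccentricity is related to the semi-major $a$ and semi-minor $b$ axes by

$$\epsilon^2 = 1 - \left(\frac{b}{a}\right)^2 \tag{9.109}$$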


Note that for a repulsive force $k > 0$, then $\epsilon > 1$ leading to unbound hyperbolic or parabolic orbits centered
on the origin. An attractive force, $k < 0$, allows for bound elliptical, as well as unbound parabolic and hyperbolic
orbits.

Figure 9.8: The elliptical equivalent trajectory for two bodies interacting via the linear, central force for
eccentricity $\epsilon = 0.75$. The left plot shows the elliptical spatial trajectory where the semi-major axis is
assumed to be on the x-axis and the angular momentum $\mathbf{L} = l\hat{\mathbf{z}}$ is out of the page. The force center
is at the center of the ellipse. The right plot is a hodograph of the linear momentum $\mathbf{p}$ for this trajectory.


9.9.2  Cartesian  coordinates 


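The isotropic harmonic oscillator, expressed in terms of cartesian coordinates in the $(x, y)$ plane of the orbit, is
separable because there is no direct coupling term between the $x$ and $y$ motion. That is, the center-of-mass
Lagrangian in the $(x, y)$ plane separates into independent motion for $x$ and $y$.

$$L = \frac{1}{2}\mu\dot{\mathbf{r}}\cdot\dot{\mathbf{r}} + \frac{1}{2}k\mathbf{r}\cdot\mathbf{r} = \left[\frac{1}{2}\mu\dot{x}^2 + \frac{1}{2}kx^2\right] + \left[\frac{1}{2}\mu\dot{y}^2 + \frac{1}{2}ky^2\right] \tag{9.110}$$

Solutions for the independent coordinates, and their corresponding momenta, are

$$\mathbf{r} = \hat{\mathbf{i}}A\cos(\omega t + \alpha) + \hat{\mathbf{j}}B\cos(\omega t + \beta) \tag{9.111}$$

$$\mathbf{p} = -\hat{\mathbf{i}}A\mu\omega\sin(\omega t + \alpha) - \hat{\mathbf{j}}B\mu\omega\sin(\omega t + \beta) \tag{9.112}$$

where $\omega = \sqrt{|k|/\mu}$ for the attractive force. Therefore, using $\cos^2\theta = \frac{1}{2}(1 + \cos 2\theta)$,

$$r^2 = x^2 + y^2 = \left[A\cos(\omega t + \alpha)\right]^2 + \left[B\cos(\omega t + \beta)\right]^2 = \frac{A^2 + B^2}{2} + \frac{\sqrt{A^4 + B^4 + 2A^2B^2\cos 2(\alpha - \beta)}}{2}\cos(2\omega t + \psi_0) \tag{9.113}$$

where

$$\cos\psi_0 = \frac{A^2\cos 2\alpha + B^2\cos 2\beta}{\sqrt{A^4 + B^4 + 2A^2B^2\cos 2(\alpha - \beta)}} \tag{9.114}$$

For a phase difference $\alpha - \beta = \pm\frac{\pi}{2}$, this equation describes an ellipse centered at the origin
which agrees with equation 9.107 that was derived using polar coordinates.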
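
The closed form for $r^2(t)$ can be compared with the direct sum $x^2 + y^2$. The sketch below assumes the double-angle identity $\cos^2\theta = \frac{1}{2}(1 + \cos 2\theta)$, so that the phases enter as $2\alpha$ and $2\beta$, and uses `atan2` to fix the quadrant of $\psi_0$; the amplitudes, phases and frequency are hypothetical values chosen for illustration.

```python
import math

def r2_direct(t, A, B, alpha, beta, omega):
    """r^2 from the cartesian solutions x(t) and y(t)."""
    x = A*math.cos(omega*t + alpha)
    y = B*math.cos(omega*t + beta)
    return x*x + y*y

def r2_closed_form(t, A, B, alpha, beta, omega):
    """r^2 as a constant plus a single cosine at twice the frequency."""
    # Components of the combined double-angle phasor
    c = A*A*math.cos(2*alpha) + B*B*math.cos(2*beta)
    s = A*A*math.sin(2*alpha) + B*B*math.sin(2*beta)
    R = math.hypot(c, s)     # sqrt(A^4 + B^4 + 2 A^2 B^2 cos 2(alpha - beta))
    psi0 = math.atan2(s, c)  # cos(psi0) = c/R, sin(psi0) = s/R
    return 0.5*(A*A + B*B) + 0.5*R*math.cos(2*omega*t + psi0)

# Hypothetical parameters (assumed values)
A, B, alpha, beta, omega = 2.0, 1.0, 0.4, 0.4 - math.pi/2, 1.3
worst = max(abs(r2_direct(0.1*i, A, B, alpha, beta, omega)
                - r2_closed_form(0.1*i, A, B, alpha, beta, omega))
            for i in range(200))
print(worst)
```

With the phase difference set to $\pi/2$ as here, $r^2$ oscillates between $B^2$ and $A^2$ twice per cycle, as expected for an ellipse centred on the force centre.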

The two normal modes of the isotropic harmonic oscillator are degenerate, therefore $x$, $y$ are equally good normal
modes with two corresponding total energies, $E_1$, $E_2$, while the corresponding angular momentum $J$ points in the
$z$ direction. Writing $k$ here for the magnitude of the attractive force constant,

$$E_1 = \frac{p_x^2}{2\mu} + \frac{1}{2}kx^2 \tag{9.115}$$

$$E_2 = \frac{p_y^2}{2\mu} + \frac{1}{2}ky^2 \tag{9.116}$$

$$J = \mu(x\dot{y} - y\dot{x}) = xp_y - yp_x \tag{9.117}$$

Figure 9.8 shows the closed elliptical equivalent orbit plus the corresponding momentum hodograph for the isotropic
harmonic two-body central force. Figures 9.7 and 9.8 contrast the differences between the elliptical orbits for the
inverse-square force, and those for the harmonic two-body central force. Although the harmonic two-body force and the
inverse-square force both lead to elliptical bound orbits, there are important differences. Both the radial motion and
momentum are two valued per cycle for the reflection-symmetric harmonic oscillator, whereas the radius and momentum have
only one maximum and one minimum per revolution for the inverse-square law. Although the inverse-square, and the
isotropic, harmonic, two-body central forces both lead to closed bound elliptical orbits for which the angular momentum
is conserved and the orbits are planar, there is another important difference between the orbits for these two
interactions. The orbit equation for the Kepler problem is expressed with respect to a focus of the elliptical
equivalent orbit, as illustrated in figure 9.7, whereas the orbit equation for the isotropic harmonic oscillator orbit
is expressed with respect to the center of the ellipse as illustrated in figure 9.8.


9.9.3  Symmetry  tensor  A' 

The invariant vectors $\mathbf{L}$ and $\mathbf{A}$ provide a complete specification of the geometry of the bound orbits
for the inverse square-law Kepler system. It is interesting to search for a similar invariant that fully specifies the
orbits for the isotropic harmonic central force. In contrast to the Kepler problem, the harmonic force center is at the
center of the elliptical orbit, and the orbit is reflection symmetric with the radial and angular frequencies related by
$\omega_r = 2\omega_\psi$. Since the orbit is reflection-symmetric, the orientation of the major axis of the orbit
cannot be uniquely specified by a vector. Therefore, for the harmonic interaction it is necessary to specify the
orientation of the principal axis by the symmetry tensor. The symmetry of the isotropic harmonic, two-body, central
force leads to the symmetry tensor $\mathbf{A}'$, which is an invariant of the motion analogous to the eccentricity
vector $\mathbf{A}$. Like a rotation matrix, the symmetry tensor defines the orientation, but not direction, of the
major principal axis of the elliptical orbit. In the plane of the orbit the $3 \times 3$ symmetry tensor $\mathbf{A}'$
reduces to a $2 \times 2$ matrix having matrix elements defined to be

$$A'_{ij} = \frac{p_ip_j}{2\mu} + \frac{1}{2}kx_ix_j \tag{9.118}$$

where $k$ here denotes the magnitude of the attractive force constant. The diagonal matrix elements are
$A'_{11} = E_1$ and $A'_{22} = E_2$, which are constants of motion. The off-diagonal term is given by

$$A'^2_{12} \equiv \left(\frac{p_xp_y}{2\mu} + \frac{1}{2}kxy\right)^2 = \left(\frac{p_x^2}{2\mu} + \frac{1}{2}kx^2\right)\left(\frac{p_y^2}{2\mu} + \frac{1}{2}ky^2\right) - \frac{k}{4\mu}\left(xp_y - yp_x\right)^2 = E_1E_2 - \frac{kJ^2}{4\mu} \tag{9.119}$$

The terms on the right-hand side of equation 9.119 all are constants of motion, therefore $A'_{12}$ also is a constant
of motion. Thus the $3 \times 3$ symmetry tensor $\mathbf{A}'$ can be reduced to a $2 \times 2$ symmetry tensor for
which all the matrix elements are constants of motion, and the trace of the symmetry tensor is equal to the total
energy.
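
The constancy of the symmetry-tensor elements can be verified along an explicit oscillator trajectory. The sketch below assumes $k > 0$ denotes the magnitude of the attractive force constant, $\omega = \sqrt{k/\mu}$, and hypothetical amplitudes and phases; the function names are illustrative only.

```python
import math

def state(t, A, B, alpha, beta, mu, k):
    """Position and momentum of the isotropic oscillator at time t."""
    w = math.sqrt(k/mu)
    x, y = A*math.cos(w*t + alpha), B*math.cos(w*t + beta)
    px = -mu*w*A*math.sin(w*t + alpha)
    py = -mu*w*B*math.sin(w*t + beta)
    return x, y, px, py

def symmetry_tensor(x, y, px, py, mu, k):
    """Elements (A'_11, A'_12, A'_22) of A'_ij = p_i p_j/(2 mu) + k x_i x_j/2."""
    a11 = px*px/(2*mu) + 0.5*k*x*x
    a12 = px*py/(2*mu) + 0.5*k*x*y
    a22 = py*py/(2*mu) + 0.5*k*y*y
    return a11, a12, a22

mu, k = 2.0, 3.0                       # assumed values, arbitrary units
params = (1.5, 0.7, 0.3, 1.1)          # hypothetical A, B, alpha, beta

ref = symmetry_tensor(*state(0.0, *params, mu, k), mu, k)
drift = 0.0                            # largest change of any element over time
identity_error = 0.0                   # largest violation of A'_12^2 = E1 E2 - k J^2/(4 mu)
for i in range(1, 500):
    x, y, px, py = state(0.05*i, *params, mu, k)
    a11, a12, a22 = symmetry_tensor(x, y, px, py, mu, k)
    drift = max(drift, abs(a11 - ref[0]), abs(a12 - ref[1]), abs(a22 - ref[2]))
    J = x*py - y*px
    identity_error = max(identity_error, abs(a12*a12 - (a11*a22 - k*J*J/(4*mu))))
print(drift, identity_error)
```

The trace $A'_{11} + A'_{22}$ printed from `ref` equals the total energy $\frac{1}{2}k(A^2 + B^2)$ for these parameters.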

In summary, the inverse-square, and harmonic oscillator two-body central interactions both lead to closed, elliptical
equivalent orbits, the plane of which is perpendicular to the conserved angular momentum vector. However, for the
inverse-square force, the origin of the equivalent orbit is at the focus of the ellipse and $\omega_r = \omega_\psi$,
whereas the origin is at the center of the ellipse and $\omega_r = 2\omega_\psi$ for the harmonic force. As a
consequence, the elliptical orbit is reflection symmetric for the harmonic force but not for the inverse square force.
The eccentricity vector and symmetry tensor both specify the major axes of these elliptical orbits, the plane of which
is perpendicular to the angular momentum vector. The eccentricity vector, and the symmetry tensor, both are directly
related to the eccentricity of the orbit and the total energy of the two-body system. Noether's theorem states that the
invariance of the eccentricity vector and symmetry tensor, plus the corresponding closed orbits, are manifestations of
underlying symmetries. The dynamical SU(3) symmetry underlies the invariance of the symmetry tensor, whereas the
dynamical O(4) symmetry underlies the invariance of the eccentricity vector. These symmetries lead to stable closed
elliptical bound orbits only for these two specific two-body central forces, and not for other two-body central forces.

9.10 Closed-orbit stability


Bertrand's theorem states that the linear oscillator and the inverse-square law are the only two-body, central forces for
which all bound orbits are single-valued, stable, closed orbits. The stability of closed orbits can be illustrated by
studying their response to perturbations. For simplicity, the following discussion of stability will focus on circular
orbits, but the general principles are the same for elliptical orbits.

A circular orbit occurs whenever the attractive force just balances the effective "centrifugal force" in the rotating
frame. This can occur for any radial functional form for the central force. The effective potential, equation 9.33, will
have a stationary point when

$$\left(\frac{\partial U_{eff}}{\partial r}\right)_{r=r_0} = 0 \tag{9.120}$$

that is, when

$$\left(\frac{\partial U}{\partial r}\right)_{r=r_0} - \frac{l^2}{\mu r_0^3} = 0 \tag{9.121}$$

This is equivalent to the statement that the net force is zero. Since the central attractive force is given by

$$F(r) = -\frac{\partial U}{\partial r} \tag{9.122}$$

then the stationary point occurs when

$$F(r_0) = -\frac{l^2}{\mu r_0^3} = -\mu r_0\dot{\psi}^2 \tag{9.123}$$

This is the so-called centrifugal force in the rotating frame. The Hamiltonian, equation 9.44, gives that

$$\dot{r} = \pm\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)} \tag{9.124}$$

For a circular orbit $\dot{r} = 0$, that is

$$E_{cm} = U + \frac{l^2}{2\mu r^2} \tag{9.125}$$

A stable circular orbit is possible if both equations (9.121) and (9.125) are satisfied. Such a circular orbit will be a
stable orbit at the minimum when

$$\left(\frac{d^2U_{eff}}{dr^2}\right)_{r=r_0} > 0 \tag{9.126}$$

Examples of stable and unstable orbits are shown in figure 9.9.

Figure 9.9: Stable and unstable effective central potentials. The repulsive centrifugal and the attractive potentials
(k<0) are shown dashed. The solid curve is the effective potential.

Stability of a circular orbit requires that

$$\left(\frac{\partial^2U}{\partial r^2}\right)_{r=r_0} + \frac{3l^2}{\mu r_0^4} > 0 \tag{9.127}$$

which can be written in terms of the central force for a stable orbit as

$$-\left[\left(\frac{\partial F}{\partial r}\right)_{r=r_0} + \frac{3F(r_0)}{r_0}\right] > 0 \tag{9.128}$$

If the attractive central force can be expressed as a power law

$$F(r) = -kr^n \tag{9.129}$$

then stability requires

$$kr_0^{n-1}(3 + n) > 0 \tag{9.130}$$

or

$$n > -3 \tag{9.131}$$

Stable equivalent orbits will undergo oscillations about the stable orbit if perturbed. To first order, the restoring
force on a bound reduced mass $\mu$ is given by

$$F_{restore} = -\left(\frac{d^2U_{eff}}{dr^2}\right)_{r=r_0}(r - r_0) = \mu\ddot{r} \tag{9.132}$$

To the extent that this linear restoring force dominates over higher-order terms, a perturbation of the stable orbit
will undergo simple harmonic oscillations about the stable orbit with angular frequency

$$\omega = \sqrt{\frac{1}{\mu}\left(\frac{d^2U_{eff}}{dr^2}\right)_{r=r_0}} \tag{9.133}$$

The above discussion shows that a small amplitude radial oscillation about the stable orbit with amplitude $\xi$ will be
of the form

$$\xi = A\sin(\omega t + \delta)$$

The orbit will be closed if the number of radial oscillations per orbit period $\tau$, that is $\omega\tau/2\pi$, is an
integer value.
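
For a power-law force the ratio of the radial oscillation frequency to the orbital angular frequency depends only on the exponent $n$. The following sketch evaluates that ratio for a circular orbit of radius $r_0$ by differentiating $dU_{eff}/dr$ numerically with a central difference. The default values of `k`, `mu`, `r0` and the step `h` are assumptions for illustration, and the function name is hypothetical.

```python
import math

def frequency_ratio(n, k=1.0, mu=1.0, r0=1.0, h=1e-4):
    """Ratio omega_r/omega_psi for small radial oscillations about a circular
    orbit of radius r0 in the attractive power-law force F(r) = -k r**n.
    Returns None when the stationary point is not a minimum of U_eff."""
    l2 = mu*k*r0**(n + 3)                 # from F(r0) = -l^2/(mu r0^3)
    def dUeff(r):
        # dU_eff/dr = -F(r) - l^2/(mu r^3)
        return k*r**n - l2/(mu*r**3)
    d2Ueff = (dUeff(r0 + h) - dUeff(r0 - h))/(2*h)   # central difference
    if d2Ueff <= 0:
        return None
    omega_r = math.sqrt(d2Ueff/mu)        # radial oscillation frequency
    omega_psi = math.sqrt(l2)/(mu*r0*r0)  # orbital angular frequency l/(mu r0^2)
    return omega_r/omega_psi

ratio_linear = frequency_ratio(1)         # linear restoring force
ratio_kepler = frequency_ratio(-2)        # inverse-square force
ratio_unstable = frequency_ratio(-4)      # violates n > -3
print(ratio_linear, ratio_kepler, ratio_unstable)
```

The linear force gives a ratio close to 2 and the inverse-square force a ratio close to 1, the two integer cases singled out by Bertrand's theorem, while $n = -4$ has no stable circular orbit.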

The  fact  that  planetary  orbits  in  the  gravitational  field  are  observed  to  be  closed  is  strong  evidence 
that  the  gravitational  force  field  must  obey  the  inverse  square  law.  Actually  there  are  small  precessions  of 
planetary  orbits  due  to  perturbations  of  the  gravitational  field  by  bodies  other  than  the  sun,  and  due  to 
relativistic  effects.  Also  the  gravitational  field  near  the  earth  departs  slightly  from  the  inverse  square  law 
because  the  earth  is  not  a perfect  sphere,  and  the  field  does  not  have  perfect  spherical  symmetry.  The  study 
of  the  precession  of  satellites  around  the  earth  has  been  used  to  determine  the  oblate  quadrupole  and  slight 
octupole  (pear  shape)  distortion  of  the  shape  of  the  earth. 

The most famous test of the inverse square law for gravitation is the precession of the perihelion of Mercury. If the
attractive force experienced by Mercury is of the form

$$\mathbf{F}(\mathbf{r}) = -G\frac{m_sm_m}{r^{2+\alpha}}\hat{\mathbf{r}}$$

where $|\alpha|$ is small, then it can be shown that, for approximately circular orbits, the perihelion will advance by a
small angle $\pi\alpha$ per orbit period. That is, the precession is zero if $\alpha = 0$, corresponding to an inverse
square law dependence, which agrees with Bertrand's theorem. The position of the perihelion of Mercury has been measured
with great accuracy showing that, after correcting for all known perturbations, the perihelion advances by 43(±5)
seconds of arc per century, that is $5 \times 10^{-7}$ radians per revolution. This corresponds to
$\alpha = 1.6 \times 10^{-7}$ which is small but still significant. This precession remained a puzzle for many years
until 1915 when Einstein predicted that one consequence of his general theory of relativity is that the planetary orbit
of Mercury should precess at 43 seconds of arc per century, which is in remarkable agreement with observations.
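
The conversion from 43 seconds of arc per century to radians per revolution, and then to $\alpha$, is simple arithmetic. A short sketch follows, assuming a sidereal orbital period for Mercury of about 0.2408 years (a value not given in the text) and the relation that the perihelion advance per orbit is $\pi\alpha$.

```python
import math

ARCSEC_TO_RAD = math.pi/(180*3600)              # one second of arc in radians

precession_per_century = 43.0*ARCSEC_TO_RAD     # rad per century (value quoted above)
mercury_period_years = 0.2408                   # assumed sidereal period in years
orbits_per_century = 100.0/mercury_period_years

advance_per_orbit = precession_per_century/orbits_per_century   # rad per revolution
alpha = advance_per_orbit/math.pi               # perihelion advance is pi*alpha per orbit
print(advance_per_orbit, alpha)
```

The two printed numbers are close to the $5 \times 10^{-7}$ radians per revolution and $\alpha = 1.6 \times 10^{-7}$ quoted above.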
9.3 Example: Linear two-body restoring force

The effective potential for a linear two-body restoring force $F = -kr$ is

$$U_{eff} = \frac{1}{2}kr^2 + \frac{l^2}{2\mu r^2}$$

At the minimum

$$\left(\frac{\partial U_{eff}}{\partial r}\right)_{r=r_0} = kr_0 - \frac{l^2}{\mu r_0^3} = 0$$

Thus

$$r_0 = \left[\frac{l^2}{\mu k}\right]^{1/4}$$

and

$$\left(\frac{\partial^2U_{eff}}{\partial r^2}\right)_{r=r_0} = \frac{3l^2}{\mu r_0^4} + k = 4k > 0$$

which is a stable orbit. Small perturbations of such a stable circular orbit will have an angular frequency

$$\omega = \sqrt{\frac{1}{\mu}\left(\frac{\partial^2U_{eff}}{\partial r^2}\right)_{r=r_0}} = 2\sqrt{\frac{k}{\mu}}$$

Note that this is twice the frequency for the planar harmonic oscillator with the same restoring coefficient. This is
because, owing to the centrifugal repulsion, the effective potential well for this rotating oscillator has about half
the width of that for the corresponding planar harmonic oscillator. Note that the kinetic energy for the rotational
motion, which is $\frac{l^2}{2\mu r_0^2}$, equals the potential energy $\frac{1}{2}kr_0^2$ at the minimum, as predicted
by the Virial Theorem for a linear two-body restoring force.

9.4  Example:  Inverse  square  law  attractive  force 

The effective potential for an inverse square law restoring force $F = -\frac{k}{r^2}$, where $k$ is assumed to be
positive, is

$$U_{eff} = -\frac{k}{r} + \frac{l^2}{2\mu r^2}$$

At the minimum

$$\left(\frac{\partial U_{eff}}{\partial r}\right)_{r=r_0} = \frac{k}{r_0^2} - \frac{l^2}{\mu r_0^3} = 0$$

Thus

$$r_0 = \frac{l^2}{\mu k}$$

and

$$\left(\frac{\partial^2U_{eff}}{\partial r^2}\right)_{r=r_0} = \frac{3l^2}{\mu r_0^4} - \frac{2k}{r_0^3} = \frac{k}{r_0^3} > 0$$

which is a stable orbit. Small perturbations about such a stable circular orbit will have an angular frequency

$$\omega = \sqrt{\frac{1}{\mu}\left(\frac{\partial^2U_{eff}}{\partial r^2}\right)_{r=r_0}} = \frac{\mu k^2}{l^3}$$

The kinetic energy for the rotational motion on this stable circular orbit, which is $\frac{l^2}{2\mu r_0^2}$, equals
half the magnitude of the potential energy $-\frac{k}{r_0}$ at the minimum, as predicted by the Virial Theorem.

9.5 Example: Attractive inverse cubic central force

The inverse cubic force is an interesting example to investigate the stability of the orbit equations. One solution of
the inverse cubic central force, for a reduced mass $\mu$, is a spiral orbit

$$r = r_0e^{\alpha\psi}$$

That this is true can be shown by inserting this orbit into the differential orbit equation. Using a Binet
transformation to the variable $u$ gives

$$u = \frac{1}{r} = \frac{1}{r_0}e^{-\alpha\psi}$$

$$\frac{du}{d\psi} = -\frac{\alpha}{r_0}e^{-\alpha\psi}$$

$$\frac{d^2u}{d\psi^2} = \frac{\alpha^2}{r_0}e^{-\alpha\psi}$$

Substituting these into the differential equation of the orbit

$$\frac{d^2u}{d\psi^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2}F\left(\frac{1}{u}\right)$$

gives

$$\frac{\alpha^2}{r_0}e^{-\alpha\psi} + \frac{1}{r_0}e^{-\alpha\psi} = -\frac{\mu}{l^2}\frac{1}{u^2}F\left(\frac{1}{u}\right)$$

That is

$$F = -\left(\alpha^2 + 1\right)\frac{l^2}{\mu r^3}$$

which is a central attractive inverse cubic force.

The time dependence of the spiral orbit can be derived since the angular momentum gives

$$\dot{\psi} = \frac{l}{\mu r^2} = \frac{l}{\mu r_0^2e^{2\alpha\psi}}$$

This can be written as

$$e^{2\alpha\psi}d\psi = \frac{l}{\mu r_0^2}dt$$

Integrating gives

$$\frac{e^{2\alpha\psi}}{2\alpha} = \frac{lt}{\mu r_0^2} + \beta$$

where $\beta$ is a constant. But the orbit gives

$$r^2 = r_0^2e^{2\alpha\psi} = \frac{2\alpha lt}{\mu} + 2\alpha\beta r_0^2$$

Thus the radius increases or decreases as the square root of the time. That is, an attractive inverse cubic central
force does not have a stable orbit, which is what is expected since there is no minimum in the effective potential
energy. Note that it is obvious that there will be no minimum or maximum of the effective potential energy since, if the
force is $F = -\frac{k}{r^3}$, then the effective potential energy is

$$U_{eff} = -\frac{k}{2r^2} + \frac{l^2}{2\mu r^2} = \left(\frac{l^2}{\mu} - k\right)\frac{1}{2r^2}$$

which has no minimum or maximum.

9.6 Example: Spiralling mass attached by a string to a hanging mass

An example of an application of orbit stability is the case shown in the adjacent figure. A particle of mass $m$ moves
on a horizontal frictionless table. It is attached by a light string of fixed length $b$ and rotates about a hole in the
table. The string is attached to a second equal mass $m$ that is hanging vertically downwards with no angular motion.

Figure: Rotating mass $m$ on a frictionless horizontal table connected to a suspended mass $m$.

The equations are most conveniently expressed in cylindrical coordinates $(r, \theta, z)$ with the origin at the hole in
the table, and $z$ vertically upward. The fixed length of the string requires $z = r - b$. The potential energy is

$$U = mgz = mg(r - b)$$

The system is central and conservative, thus the Hamiltonian can be written as

$$H = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right) + \frac{m}{2}\dot{r}^2 + mg(r - b) = E$$

The Lagrangian is independent of $\theta$, that is, $\theta$ is cyclic, thus the angular momentum $mr^2\dot{\theta} = l$
is a constant of motion. Substituting this into the Hamiltonian equation gives

$$m\dot{r}^2 + \frac{l^2}{2mr^2} + mg(r - b) = E$$

The effective potential is

$$U_{eff} = \frac{l^2}{2mr^2} + mg(r - b)$$

which is shown in the adjacent figure. The stationary value occurs when

$$\left(\frac{\partial U_{eff}}{\partial r}\right)_{r_0} = -\frac{l^2}{mr_0^3} + mg = 0$$

That is, when the angular momentum is related to the radius by

$$l^2 = m^2gr_0^3$$

Note that $r_0 = 0$ if $l = 0$.

The stability of the solution is given by the second derivative

$$\left(\frac{\partial^2U_{eff}}{\partial r^2}\right)_{r_0} = \frac{3l^2}{mr_0^4} = \frac{3mg}{r_0} > 0$$

Therefore the stationary point is stable.

Note that the equation of motion for the minimum can be expressed in terms of the restoring force on the two masses

$$2m\ddot{r} = -\frac{3mg}{r_0}(r - r_0)$$

Thus the system undergoes harmonic oscillation with frequency

$$\omega = \sqrt{\frac{3mg/r_0}{2m}} = \sqrt{\frac{3g}{2r_0}}$$

The solution of this system is stable and undergoes simple harmonic motion.

9.11 The three-body problem

Two bodies interacting via conservative central forces can be solved analytically for the inverse square law and the
Hooke's law radial dependences as already discussed. For central forces having other radial dependences the equations of
motion may not be expressible in terms of simple functions, nevertheless the motion always can be given in terms of an
integral. For a gravitational system comprising $n \geq 3$ bodies that are interacting via the two-body central
gravitational force, the equations of motion can be written as

$$m_j\ddot{\mathbf{q}}_j = G\sum_{k \neq j}^{n}\frac{m_jm_k\left(\mathbf{q}_k - \mathbf{q}_j\right)}{\left|\mathbf{q}_k - \mathbf{q}_j\right|^3} \qquad (j = 1, 2, .., n)$$

Even when all the $n$ bodies are interacting via two-body central forces, the problem usually is insoluble in terms of
known analytic integrals. Newton first posed the difficulty of the three-body Kepler problem which has been studied
extensively by mathematicians and physicists. No known general analytic integral solution has been found. Each body for
the n-body system has 6 degrees of freedom, that is, 3 for position and 3 for momentum. The center-of-mass motion can be
factored out, therefore the center-of-mass system for the n-body system has $6n - 10$ degrees of freedom after
subtraction of 3 degrees for location of the center of mass, 3 for the linear momentum of the center of mass, 3 for
rotation of the center of mass, and 1 for the total energy of the system. Thus for $n = 2$ there are $12 - 10 = 2$
degrees of freedom for the two-body system, which the Kepler approach takes to be $r$ and $\theta$. For $n = 3$ there
are 8 degrees of freedom in the center of mass system that have to be determined.

Numerical solutions to the three-body problem can be obtained using successive approximation or perturbation methods in
computer calculations. The problem can be simplified by restricting the motion to either of the following two
approximations:

1) Planar approximation

This approximation assumes that the three masses move in the same plane, that is, the number of degrees of freedom is
reduced from 8 to 6, which simplifies the numerical solution.

2) Restricted three-body approximation

The restricted three-body approximation assumes that two of the masses are large and bound while the third mass is
negligible such that the perturbation of the motion of the larger two by the third body is negligible. This
approximation essentially reduces the system to a two-body problem in order to calculate the gravitational fields that
act on the third much lighter mass.

Euler and Lagrange showed that the restricted three-body system has five points at which the combined gravitational
attraction of the two large bodies and the centrifugal force in the rotating frame cancel. These are called the Lagrange
points and are used for parking satellites in stable orbits with respect to the Earth-Moon system, or with respect to
the Sun-Earth system. Figure 9.10 illustrates the five Lagrange points for the Earth-Sun system. Only two of the Lagrange
points, L4 and L5, lead to stable orbits. Note that these Lagrange points are fixed with respect to the Earth-Sun system
which rotates with respect to inertial coordinate frames. The 1900's discovery of the Trojan asteroids at the L4 and L5
Lagrange points of the Sun-Jupiter system confirmed the Lagrange predictions.
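
As an illustration of how a Lagrange point is located, the sketch below finds the collinear point L1 that lies between the two large bodies by bisection on the net acceleration in the rotating frame. It assumes circular orbits of the two large bodies, units in which $G(m_1 + m_2) = 1$ and their separation is 1, and an approximate Earth-to-(Sun + Earth) mass ratio of $3.0 \times 10^{-6}$ that is not taken from the text; the function name is hypothetical.

```python
def l1_distance_from_secondary(mass_ratio, tol=1e-14):
    """Distance of L1 from the smaller body, in units of the separation of the
    two large bodies, for the circular restricted three-body problem.
    mass_ratio = m2/(m1 + m2), with m2 the smaller of the two large masses."""
    m = mass_ratio

    def net(x):
        # Net x-acceleration in the rotating frame with G(m1 + m2) = 1,
        # separation = 1 and angular velocity = 1. The larger mass sits at
        # x = -m, the smaller at x = 1 - m, and the test point lies between.
        pull_primary = -(1 - m)/(x + m)**2      # toward the larger body (-x)
        pull_secondary = m/(1 - m - x)**2       # toward the smaller body (+x)
        centrifugal = x                         # outward from the center of mass
        return pull_primary + pull_secondary + centrifugal

    lo, hi = 0.5, 1 - m - 1e-9                  # net(lo) < 0 < net(hi)
    while hi - lo > tol:
        mid = 0.5*(lo + hi)
        if net(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (1 - m) - 0.5*(lo + hi)

sun_earth_ratio = 3.0e-6                        # assumed approximate mass ratio
d_l1 = l1_distance_from_secondary(sun_earth_ratio)
print(d_l1)                                     # fraction of the Sun-Earth distance
```

The result is about one percent of the Sun-Earth separation, comparable to the estimate $(m/3)^{1/3}$ for a small mass ratio $m$.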

Poincaré showed that the motion of a light mass bound to two heavy bodies can exhibit extreme sensitivity to initial
conditions as well as characteristics of chaos. The general three-body problem has remained largely unsolved since
Newton discovered the difficulties involved.


Figure 9.10: A contour plot of the effective potential for the Sun-Earth gravitational system in the rotating frame
where the Sun and Earth are stationary. The 5 Lagrange points $L_j$ are saddle points where the net force is zero.
(Figure created by NASA)

9.12 Two-body scattering

Two moving bodies interacting via a central force scatter when the force is repulsive, or when an attractive system is
unbound. Two-body scattering is encountered extensively in astronomy and in atomic, nuclear, and particle physics. The
probability of such scattering is most conveniently expressed in terms of scattering cross sections defined below.

9.12.1  Total  two-body  scattering  cross  section 

The concept of scattering cross section for two-body scattering is most easily described for the total two-body cross section. The probability $P$ that a beam of $n_B$ incident point particles/second, distributed over a cross sectional area $A_B$, will hit a single solid object having a cross sectional area $\sigma$ is given by the ratio of the areas, as illustrated in figure 9.11. That is,

$$P = \frac{\sigma}{A_B} \qquad (9.134)$$

where it is assumed that $A_B \gg \sigma$. For a spherical target body of radius $r$, the cross section $\sigma = \pi r^2$. The scattering probability $P$ is proportional to the cross section $\sigma$, which is the cross section of the target body perpendicular to the beam; thus $\sigma$ has the units of area.

Figure 9.11: Scattering probability for an incident beam of cross sectional area $A_B$ by a target body of cross sectional area $\sigma$.

Since the incident beam of $n_B$ incident point particles/second has a cross sectional area $A_B$, it has an areal density $I$ given by

$$I = \frac{n_B}{A_B} \quad \text{beam particles/m}^2\text{/s} \qquad (9.135)$$

The number of beam particles scattered per second, $N_S$, by this single target scatterer then equals

$$N_S = P n_B = \frac{\sigma}{A_B} I A_B = \sigma I \qquad (9.136)$$

Thus the cross section for scattering by this single target body is

$$\sigma = \frac{N_S}{I} = \frac{\text{scattered particles/s}}{\text{incident beam/m}^2\text{/s}}$$

Realistically one will have many target scatterers in the target, and the total scattering probability increases proportionally to the number of target scatterers. That is, for a target comprising an areal density of $\eta_T$ target bodies per unit area of the incident beam, the number scattered increases in proportion to the target areal density $\eta_T$. There will be $\eta_T A_B$ scattering bodies that interact with the beam, assuming that the target has a larger area than the beam. Thus the total number scattered per second, $N_S$, by a target that comprises multiple scatterers is

$$N_S = \sigma \frac{n_B}{A_B} \eta_T A_B = \sigma n_B \eta_T \qquad (9.137)$$

Note that this is independent of the cross sectional area of the beam, assuming that the target area is larger than that of the beam. That is, the number scattered per second is proportional to the cross section $\sigma$ times the product of the number of incident particles per second, $n_B$, and the areal density of target scatterers, $\eta_T$. Typical cross sections encountered in astrophysics are $\sigma \approx 10^{14}$ m$^2$, in atomic physics $\sigma \approx 10^{-20}$ m$^2$, and in nuclear physics $\sigma \approx 10^{-28}$ m$^2$ = 1 barn.$^3$

N.B. The above derivation assumed that the target is larger than the cross sectional area of the incident beam. If the target is smaller than the beam, then $n_B$ is replaced by the areal density per second of the beam, $\eta_B$, and $\eta_T$ is replaced by the number of target particles, $n_T$, and the cross-sectional size of the target cancels.
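As a rough numerical illustration of equation 9.137, the sketch below evaluates $N_S = \sigma n_B \eta_T$ in Python. The function name and the beam and target values are hypothetical, chosen only to show the units involved; they are not taken from the text.

```python
BARN_M2 = 1.0e-28  # one barn expressed in square metres


def scattered_per_second(sigma_m2, beam_rate_per_s, target_areal_density_per_m2):
    """Equation 9.137: N_S = sigma * n_B * eta_T (target wider than the beam)."""
    return sigma_m2 * beam_rate_per_s * target_areal_density_per_m2


# Hypothetical values: a 1 barn cross section, 1e10 beam particles per second,
# and 1e22 target nuclei per square metre.
rate = scattered_per_second(1.0 * BARN_M2, 1.0e10, 1.0e22)
print(f"scattered particles per second: {rate:.3g}")
```

Because the beam area cancels in equation 9.137, it does not appear as an argument; only the beam rate and the target areal density are needed.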

$^3$ The term "barn" was chosen because nuclear physicists joked that the cross sections for neutron scattering by nuclei were as large as a barn door.

9.12.2 Differential two-body scattering cross section


The differential two-body scattering cross section gives much more detailed information on the scattering force than does the total cross section because of the correlation between the impact parameter and the scattering angle. That is, a measurement of the number of beam particles scattered into a given solid angle as a function of the scattering angles $\theta, \phi$ probes the radial form of the scattering force.

The differential cross section for scattering of an incident beam by a single target body into a solid angle $d\Omega$ at scattering angles $\theta, \phi$ is defined to be

$$\frac{d\sigma}{d\Omega}(\theta,\phi) \equiv \frac{1}{I}\frac{dN_S(\theta,\phi)}{d\Omega} \qquad (9.138)$$

where the right-hand side is the ratio of the number scattered per target nucleus into solid angle $d\Omega(\theta,\phi)$ to the incident beam intensity $I$ in particles/m$^2$/s.

Reasoning similar to that used to derive equation 9.137 shows that the number of beam particles scattered into a solid angle $d\Omega$ for $n_B$ beam particles incident upon a target with areal density $\eta_T$ is

$$\frac{dN_S(\theta,\phi)}{d\Omega} = n_B \eta_T \frac{d\sigma}{d\Omega}(\theta,\phi) \qquad (9.139)$$

Consider the equivalent one-body system for scattering of one body by a scattering force center in the center of mass. As shown in figures 9.6 and 9.12, the perpendicular distance between the center of force of the two-body system and the trajectory of the incoming body at infinite distance is called the impact parameter $b$. For a central force the scattering system has cylindrical symmetry, therefore the solid angle $d\Omega(\theta,\phi) = \sin\theta\, d\theta\, d\phi$ can be integrated over the azimuthal angle $\phi$ to give $d\Omega(\theta) = 2\pi \sin\theta\, d\theta$.

For the inverse-square, two-body, central force there is a one-to-one correspondence between impact parameter $b$ and scattering angle $\theta$ for a given bombarding energy. In this case, conservation of flux means that the number of incident beam particles passing through the impact-parameter annulus between $b$ and $b + db$ must equal the number scattered between the corresponding angles $\theta$ and $\theta + d\theta$. That is, for an incident beam flux of $I$ particles/m$^2$/s the number of particles per second passing through the annulus is

$$I\, 2\pi b\, |db| = 2\pi \frac{d\sigma}{d\Omega} I \sin\theta\, |d\theta| \qquad (9.140)$$

The modulus is used to ensure that the number of particles is always positive. Thus

$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right| \qquad (9.141)$$

Figure 9.12: The equivalent one-body problem for scattering of a reduced mass $\mu$ by a force centre in the centre of mass system.


9.12.3  Impact  parameter  dependence  on  scattering  angle 

If the function $b = f(\theta, E_{cm})$ is known, then it is possible to evaluate $\left|\frac{db}{d\theta}\right|$, which can be used in equation 9.141 to calculate the differential cross section. A simple and important case to consider is two-body elastic scattering for an inverse-square law force such as the Coulomb or gravitational forces. To avoid confusion, in the following discussion the center-of-mass scattering angle will be called $\theta$, while the angle used to define the hyperbolic orbits in the discussion of trajectories for the inverse square law will be called $\psi$.

In chapter 9.8 the equivalent one-body representation gave that the radial distance for a trajectory for the inverse square law is given by

$$\frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos\psi\right] \qquad (9.142)$$

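Note that closest approach is when $\psi = 0$, while for $r \to \infty$ the bracket must equal zero, that is

$$\cos\psi_\infty = \pm\left|\frac{1}{\epsilon}\right| \qquad (9.143)$$

The polar angle $\psi$ is measured with respect to the symmetry axis of the two-body system, which is along the line of the distance of closest approach as shown in figure 9.6. The geometry and symmetry show that the scattering angle $\theta$ is related to the trajectory angle $\psi_\infty$ by

$$\theta = \pi - 2\psi_\infty \qquad (9.144)$$

Equation 9.50 gives that

$$\psi_\infty = \int_{r_{min}}^{\infty} \frac{\pm\, dr}{r^2\sqrt{\frac{2\mu}{l^2}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \qquad (9.145)$$

Since

$$l^2 = b^2 p^2 = b^2\, 2\mu E_{cm} \qquad (9.146)$$

then the scattering angle can be written as

$$\theta = \pi - 2\int_{r_{min}}^{\infty} \frac{b\, dr}{r^2\sqrt{1 - \frac{U}{E_{cm}} - \frac{b^2}{r^2}}} \qquad (9.147)$$

Let $u = \frac{1}{r}$, then

$$\theta = \pi - 2\int_{0}^{u_{max}} \frac{b\, du}{\sqrt{1 - \frac{U}{E_{cm}} - b^2u^2}} \qquad (9.148)$$

For the repulsive inverse square law

$$U = \frac{k}{r} = ku \qquad (9.149)$$

where $k$ is taken to be positive for a repulsive force. Thus the scattering angle relation becomes

$$\theta = \pi - 2\int_{0}^{u_{max}} \frac{b\, du}{\sqrt{1 - \frac{ku}{E_{cm}} - b^2u^2}} \qquad (9.150)$$

The solution of this equation is given by equation 9.63 to be

$$u = \frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos\psi\right] \qquad (9.151)$$

where the eccentricity is

$$\epsilon = \sqrt{1 + \frac{2E_{cm}l^2}{\mu k^2}} \qquad (9.152)$$

For $r \to \infty$, $u = 0$, and the asymptotic angle satisfies

$$\frac{1}{|\epsilon|} = \cos\psi_\infty = \sin\frac{\theta}{2} \qquad (9.153)$$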

Therefore

$$\frac{2E_{cm}b}{k} = \sqrt{\epsilon^2 - 1} = \cot\frac{\theta}{2} \qquad (9.154)$$

that is, the impact parameter $b$ is given by the relation

$$b = \frac{k}{2E_{cm}}\cot\frac{\theta}{2} \qquad (9.155)$$

Figure 9.13: Impact parameter dependence on scattering angle for Rutherford scattering.


Thus, for an inverse-square law force, two-body scattering has a one-to-one correspondence between impact parameter $b$ and scattering angle $\theta$, as shown schematically in figure 9.13.

If $k$ is negative, which corresponds to an attractive inverse square law, then one gets the same relation between impact parameter and scattering angle except that the sign of the impact parameter $b$ is opposite. This means that the hyperbolic trajectory has an interior rather than exterior focus. That is, the trajectory partially orbits around the center of force rather than being repelled away.

Note that the distance of closest approach is related to the eccentricity $\epsilon$ by equation 9.151, therefore

$$r_{min} = \frac{k}{2E_{cm}}(1 + \epsilon) \qquad (9.156)$$

$$r_{min} = \frac{k}{2E_{cm}}\left[1 + \frac{1}{\sin\frac{\theta}{2}}\right] \qquad (9.157)$$

Note that for $\theta = 180^\circ$

$$E_{cm} = \frac{k}{r_{min}} = U(r_{min}) \qquad (9.158)$$

which is what you would expect from equating the incident kinetic energy to the potential energy at the distance of closest approach.
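Equations 9.155 and 9.157 are simple enough to evaluate directly. The following Python sketch does so; the function names are illustrative, and the value $e^2/4\pi\epsilon_0 \approx 1.44$ MeV fm together with the $\alpha$ on gold example are assumptions introduced here for the illustration.

```python
import math

E2_MEV_FM = 1.44  # e^2/(4 pi eps0) in MeV fm (approximate, assumed value)


def impact_parameter(k, e_cm, theta):
    """Equation 9.155: b = (k / 2E_cm) cot(theta/2); theta in radians."""
    return k / (2.0 * e_cm) / math.tan(theta / 2.0)


def closest_approach(k, e_cm, theta):
    """Equation 9.157: r_min = (k / 2E_cm) [1 + 1/sin(theta/2)]."""
    return k / (2.0 * e_cm) * (1.0 + 1.0 / math.sin(theta / 2.0))


# Hypothetical example: alpha particle (Z=2) on gold (Z=79), E_cm = 7.5 MeV.
k = 2 * 79 * E2_MEV_FM  # MeV fm
e_cm = 7.5              # MeV
for deg in (30.0, 90.0, 180.0):
    th = math.radians(deg)
    print(deg, impact_parameter(k, e_cm, th), closest_approach(k, e_cm, th))
```

For a head-on collision ($\theta = 180^\circ$) the impact parameter vanishes and the distance of closest approach reduces to $k/E_{cm}$, in line with equation 9.158.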

For scattering of two nuclei by the normal repulsive Coulomb force, when the impact parameter becomes small enough the attractive nuclear force also acts, leading to the impact-parameter dependent effective potentials illustrated in figure 9.14. Trajectory 1 does not overlap the nuclear force and thus is pure Coulomb. Trajectory 2 interacts at the periphery of the nuclear potential and the trajectory deviates from the pure Coulomb trajectory shown dashed. Trajectory 3 passes through the interior of the nuclear potential. These three trajectories can all lead to the same scattering angle, and thus there no longer is a one-to-one correspondence between scattering angle and impact parameter.


Figure 9.14: Classical trajectories for scattering to a given angle by the repulsive Coulomb field plus the attractive nuclear field for three different impact parameters. Path 1 is pure Coulomb. Paths 2 and 3 include Coulomb plus nuclear interactions. The dashed parts of trajectories 2 and 3 correspond to only the Coulomb force acting, i.e. zero nuclear force.


9.12.4  Rutherford  scattering 

Two models of the atom evolved in the 1900s. The Rutherford model assumed electrons orbiting around a small nucleus like planets around the sun, while J.J. Thomson's "plum-pudding" model assumed that the electrons were embedded in a uniform sphere of positive charge the size of the atom. When Rutherford derived his classical formula in 1911 he realized that it can be used to determine the size of the nucleus, since the electric field obeys the inverse square law only outside the charged spherical nucleus. Inside a uniform sphere of charge the electric field is $E \propto r$, and thus the scattering cross section will not obey the Rutherford relation for distances of closest approach that are less than the radius of the sphere of charge. Observation of the angle beyond which the Rutherford formula breaks down immediately determines the radius of the nucleus.

For pure Coulomb scattering, equation 9.155 can be used to evaluate $\left|\frac{db}{d\theta}\right|$, which when used in equation 9.141 gives the center-of-mass Rutherford scattering cross section

$$\frac{d\sigma}{d\Omega} = \frac{1}{4}\left(\frac{k}{2E_{cm}}\right)^2 \frac{1}{\sin^4\frac{\theta}{2}} \qquad (9.159)$$

This cross section assumes elastic scattering by a repulsive two-body inverse-square central force. For scattering of nuclei in the Coulomb potential, the constant $k$ is given by

$$k = \frac{Z_p Z_T e^2}{4\pi\epsilon_0} \qquad (9.160)$$
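A quick numerical cross-check of equation 9.159: the Python sketch below differentiates the impact parameter of equation 9.155 by a central finite difference, inserts it in equation 9.141, and compares the result with the closed form. The function names, step size, and test values are choices made for this illustration.

```python
import math


def impact_parameter(k, e_cm, theta):
    """Equation 9.155 (theta in radians)."""
    return k / (2.0 * e_cm) / math.tan(theta / 2.0)


def dsigma_from_b(k, e_cm, theta, h=1.0e-6):
    """Equation 9.141 with |db/dtheta| from a central finite difference."""
    db = (impact_parameter(k, e_cm, theta + h)
          - impact_parameter(k, e_cm, theta - h)) / (2.0 * h)
    return impact_parameter(k, e_cm, theta) / math.sin(theta) * abs(db)


def rutherford(k, e_cm, theta):
    """Equation 9.159 in closed form."""
    return 0.25 * (k / (2.0 * e_cm)) ** 2 / math.sin(theta / 2.0) ** 4


k, e_cm = 1.0, 1.0  # arbitrary units
for deg in (20.0, 60.0, 120.0, 170.0):
    th = math.radians(deg)
    print(deg, dsigma_from_b(k, e_cm, th), rutherford(k, e_cm, th))
```

The two columns should agree to within the finite-difference error, which illustrates the steep $\sin^{-4}(\theta/2)$ rise of the cross section toward forward angles.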


The cross section, scattering angle and $E_{cm}$ of equation 9.159 are in the center-of-mass coordinate system, whereas usually two-body elastic scattering data involve scattering of the projectiles by a stationary target.

Geiger and Marsden performed scattering of 7.7 MeV $\alpha$ particles from a thin gold foil and proved that the differential scattering cross section obeyed the Rutherford formula back to angles corresponding to a distance of closest approach of $10^{-14}$ m, which is much smaller than the $10^{-10}$ m size of the atom. This validated the Rutherford model of the atom and immediately led to the Bohr model of the atom, which played such a crucial role in the development of quantum mechanics. Bohr showed that the agreement with the Rutherford formula implies that the Coulomb field obeys the inverse square law down to small distances. This work was performed at Manchester University, England between 1908 and 1913. It is fortunate that the classical result is identical to the quantal cross section for scattering, otherwise the development of modern physics could have been delayed for many years.

Scattering of very heavy ions, such as $^{208}$Pb, can electromagnetically excite target nuclei. For the Coulomb force the impact parameter $b$ and the distance of closest approach $r_{min}$ are directly related to the scattering angle $\theta$ by equation 9.155. Thus observing the angle of the scattered projectile unambiguously determines the hyperbolic trajectory and thus the electromagnetic impulse given to the colliding nuclei. This process, called Coulomb excitation, uses the measured angular distribution of the scattered ions for inelastic excitation of the nuclei to determine precisely the Coulomb excitation cross section as a function of impact parameter. This in turn determines the shape of the nuclear charge distribution.


9.7  Example:  Two-body  scattering  by  an  inverse  cubic  force 

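Assume two-body scattering by a potential $U = \frac{k}{r^2}$ where $k > 0$. This corresponds to a repulsive two-body force $\mathbf{F} = \frac{2k}{r^3}\hat{\mathbf{r}}$. Inserting this force into Binet's differential orbit equation 9.39 gives

$$\frac{d^2u}{d\psi^2} + u\left(1 + \frac{2k\mu}{l^2}\right) = 0$$

The solution is of the form $u = A\sin(\omega\psi + \beta)$ where $A$ and $\beta$ are constants of integration, $l = \mu r^2\dot{\psi}$, and

$$\omega^2 = 1 + \frac{2k\mu}{l^2}$$

Initially $r = \infty$, $u = 0$, and therefore $\beta = 0$. Also at $r = \infty$, $E = \frac{1}{2}\mu\dot{r}_\infty^2$, that is, $|\dot{r}_\infty| = \sqrt{\frac{2E}{\mu}}$. Then

$$\dot{r} = \frac{dr}{d\psi}\dot{\psi} = \frac{dr}{d\psi}\frac{l}{\mu r^2} = -\frac{l}{\mu}\frac{du}{d\psi} = -A\frac{l}{\mu}\omega\cos(\omega\psi)$$

The initial energy gives that $A = \frac{1}{l\omega}\sqrt{2\mu E}$. Hence the orbit equation is

$$u = \frac{1}{r} = \frac{\sqrt{2\mu E}}{l\omega}\sin\omega\psi$$

The above trajectory has a distance of closest approach, $r_{min}$, when $\psi_{min} = \frac{\pi}{2\omega}$. Moreover, due to the symmetry of the orbit, the scattering angle $\theta$ is given by

$$\theta = \pi - 2\psi_{min} = \pi\left(1 - \frac{1}{\omega}\right)$$

Since $l^2 = \mu^2 b^2 \dot{r}_\infty^2 = 2b^2\mu E$ then

$$1 - \frac{\theta}{\pi} = \left(1 + \frac{2k\mu}{l^2}\right)^{-\frac{1}{2}} = \left(1 + \frac{k}{b^2E}\right)^{-\frac{1}{2}}$$

This gives that the impact parameter $b$ is related to the scattering angle by

$$b^2 = \frac{k}{E}\frac{(\pi - \theta)^2}{(2\pi - \theta)\theta}$$

This impact parameter relation can be used in equation 9.141 to give the differential cross section

$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right| = \frac{k}{E\sin\theta}\frac{\pi^2(\pi - \theta)}{(2\pi - \theta)^2\theta^2}$$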
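The closed forms of this example can be checked numerically. The Python sketch below computes the scattering angle from $\theta = \pi(1 - 1/\omega)$ with $\omega^2 = 1 + k/(b^2E)$, inverts it with the impact parameter relation, and compares the differential cross section with a finite-difference evaluation of equation 9.141. Function names and test values are illustrative assumptions.

```python
import math


def theta_of_b(k, e, b):
    """Scattering angle from theta = pi (1 - 1/omega), omega^2 = 1 + k/(b^2 E)."""
    omega = math.sqrt(1.0 + k / (b * b * e))
    return math.pi * (1.0 - 1.0 / omega)


def b_of_theta(k, e, theta):
    """b^2 = (k/E) (pi - theta)^2 / ((2 pi - theta) theta)."""
    return math.sqrt(k / e * (math.pi - theta) ** 2
                     / ((2.0 * math.pi - theta) * theta))


def dsigma_closed(k, e, theta):
    """Closed-form differential cross section of the example."""
    return (k / (e * math.sin(theta)) * math.pi ** 2 * (math.pi - theta)
            / ((2.0 * math.pi - theta) ** 2 * theta ** 2))


def dsigma_numeric(k, e, theta, h=1.0e-6):
    """Equation 9.141 with |db/dtheta| from a central finite difference."""
    db = (b_of_theta(k, e, theta + h) - b_of_theta(k, e, theta - h)) / (2.0 * h)
    return b_of_theta(k, e, theta) / math.sin(theta) * abs(db)


k, e = 2.0, 3.0  # arbitrary units
for theta in (0.3, 1.0, 2.5):
    b = b_of_theta(k, e, theta)
    print(theta, theta_of_b(k, e, b), dsigma_closed(k, e, theta),
          dsigma_numeric(k, e, theta))
```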


These orbits are called Cotes spirals.

9.13 Two-body kinematics

So far the discussion has been restricted to the center-of-momentum system. In practice, scattering measurements are performed in the laboratory frame, and thus it is necessary to transform the scattering angles, energies and cross sections between the laboratory and center-of-momentum coordinate frames. In principle the transformation between the center-of-momentum and laboratory frames is straightforward; one only has to use vector addition of the center-of-mass velocity vector and the center-of-momentum velocity vectors of the two bodies. The following discussion assumes that non-relativistic kinematics apply.

In chapter 2.7.2 it was shown that, for Newtonian mechanics, the center-of-mass and center-of-momentum frames of reference are identical. By definition, in the center-of-momentum frame the linear momenta of the incoming projectile, $\mathbf{p}_P^{Initial}$, and target, $\mathbf{p}_T^{Initial}$, are equal and opposite. That is

$$\mathbf{p}_P^{Initial} + \mathbf{p}_T^{Initial} = 0 \qquad (9.161)$$

Using the center-of-momentum frame, coupled with the conservation of linear momentum, implies that the vector sum of the final momenta of the $N$ reaction products, $\mathbf{p}_i^{Final}$, also is zero. That is

$$\sum_{i=1}^{N} \mathbf{p}_i^{Final} = 0 \qquad (9.162)$$

An additional constraint is that energy conservation relates the initial and final kinetic energies by

$$\frac{(p_P^{Initial})^2}{2m_P} + \frac{(p_T^{Initial})^2}{2m_T} + Q = \frac{(p_P^{Final})^2}{2m_P} + \frac{(p_T^{Final})^2}{2m_T} \qquad (9.163)$$


where the Q value is the energy contributed to the final total kinetic energy by the reaction between the incoming projectile and target. For exothermic reactions, $Q > 0$, the summed kinetic energy of the reaction products exceeds the sum of the incoming kinetic energies, while for endothermic reactions, $Q < 0$, the summed kinetic energy of the reaction products is less than that of the incoming channel.

For two-body kinematics, there are three advantages to working in the center-of-momentum frame of reference.


1. The two incident colliding bodies are collinear, as are the two final bodies.

2. The linear momenta of the two colliding bodies are equal in magnitude, both in the incident channel and in the outgoing channel.

3. The total energy in the center-of-momentum coordinate frame is the energy available to the reaction during the collision. The trivial kinetic energy of the center-of-momentum frame relative to the laboratory frame is handled separately.

The kinematics for two-body reactions is easily determined using the conservation of linear momentum along and perpendicular to the beam direction plus the conservation of energy, equations 9.161 to 9.163. Note that it is common practice to use the name center-of-mass rather than center-of-momentum, in spite of the fact that for relativistic mechanics only the center-of-momentum is a meaningful concept.

General features of the transformation between the center-of-momentum and laboratory frames of reference are best illustrated by elastic or inelastic scattering of nuclei, where the two reaction products in the final channel are identical to the incident bodies. Inelastic excitation of an excited state of energy $\Delta E_{exc}$ in either reaction product corresponds to $Q = -\Delta E_{exc}$, while elastic scattering corresponds to $Q = -\Delta E_{exc} = 0$.

For inelastic scattering the conservation of linear momentum for the outgoing channel in the center-of-momentum frame simplifies to

$$\mathbf{p}_P^{Final} + \mathbf{p}_T^{Final} = 0 \qquad (9.164)$$

that is, the linear momenta of the two reaction products are equal and opposite.

Assume that the center-of-momentum direction of the scattered projectile is at an angle $\vartheta_{cm}^P = \vartheta$ relative to the direction of the incoming projectile, and that the target nucleus is scattered at a center-of-momentum direction $\vartheta_{cm}^T = \pi - \vartheta$. Elastic scattering corresponds to simple scattering for which the magnitudes of the incoming and outgoing projectile momenta are equal, that is, $|\mathbf{p}_P^{Final}| = |\mathbf{p}_P^{Initial}|$.

Figure 9.15: Vector hodograph of the scattered projectile and target velocities for a projectile, with incident velocity $v_i$, that is elastically scattered by a stationary target body. The circles show the magnitude of the projectile and target body final velocities in the center of mass. The center-of-mass velocity vectors are shown as dashed lines while the laboratory vectors are shown as solid lines. The left hodograph shows normal kinematics where the projectile mass is less than the target mass. The right hodograph shows inverse kinematics where the projectile mass is greater than the target mass. For elastic scattering $u_P = u'_P$ and $u_T = u'_T$.


Velocities 

The transformation between the center-of-momentum and laboratory frames requires knowledge of the particle velocities, which can be derived from the linear momenta since the particle masses are known. Assume that a projectile of mass $m_P$, with incident energy $E_P$ in the laboratory frame, bombards a stationary target of mass $m_T$. The incident projectile velocity $v_i$ is given by

$$v_i = \sqrt{\frac{2E_P}{m_P}} \qquad (9.165)$$

The initial velocities in the laboratory frame are taken to be

$$w_P = v_i \qquad w_T = 0 \qquad \text{(Initial Lab velocities)}$$

The final velocities in the laboratory frame after the inelastic collision are

$$w'_P \qquad w'_T \qquad \text{(Final Lab velocities)}$$

In the center-of-momentum coordinate system, equation 9.10 implies that the initial center-of-momentum velocities are

$$u_P = \frac{m_T}{m_P + m_T}v_i \qquad u_T = -\frac{m_P}{m_P + m_T}v_i \qquad (9.166)$$

It is simple to derive that the final center-of-momentum velocities after the inelastic collision have magnitudes given by

$$u'_P = \frac{m_T}{m_P + m_T}\sqrt{\frac{2}{m_P}\tilde{E}} \qquad u'_T = \frac{m_P}{m_P + m_T}\sqrt{\frac{2}{m_P}\tilde{E}} \qquad (9.167)$$


The energy $\tilde{E}$ is defined to be given by

$$\tilde{E} = E_P + Q\left(1 + \frac{m_P}{m_T}\right) \qquad (9.168)$$

where $Q = -\Delta E_{exc}$, with $\Delta E_{exc}$ the excitation energy of the final excited states in the outgoing channel.


Angles 

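The angles of the scattered recoils are written as

$$\theta_{lab}^P \qquad \theta_{lab}^T \qquad \text{(Final laboratory angles)}$$

and

$$\vartheta_{cm}^P = \vartheta \qquad \vartheta_{cm}^T = \pi - \vartheta \qquad \text{(Final CM angles)}$$

where $\vartheta$ is the center-of-mass (center-of-momentum) scattering angle.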

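From figure 9.15 it can be seen that the angles of the scattered projectile in the laboratory and CM frames are connected by

$$\frac{\sin(\vartheta_{cm}^P - \theta_{lab}^P)}{\sin\theta_{lab}^P} = \frac{m_P}{m_T}\sqrt{\frac{E_P}{\tilde{E}}} \equiv \tau \qquad (9.169)$$

where

$$\tau = \frac{m_P}{m_T}\frac{1}{\sqrt{1 + \frac{Q}{E_P}\left(1 + \frac{m_P}{m_T}\right)}} = \frac{m_P}{m_T}\frac{1}{\sqrt{1 + \frac{Q}{E_P/m_P}\left(\frac{m_P + m_T}{m_P m_T}\right)}} \qquad (9.170)$$

and $E_P/m_P$ is the energy per nucleon of the incident projectile.

Equation 9.169 can be rewritten as

$$\tan\theta_{lab}^P = \frac{\sin\vartheta_{cm}^P}{\cos\vartheta_{cm}^P + \tau} \qquad (9.171)$$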


Another useful relation from equation 9.169 gives the center-of-momentum scattering angle in terms of the laboratory scattering angle:

$$\vartheta_{cm}^P = \sin^{-1}(\tau\sin\theta_{lab}^P) + \theta_{lab}^P \qquad (9.172)$$

The inverse sine term gives the difference in angle between the lab scattering angle and the center-of-momentum scattering angle. Be careful with this relation since $\vartheta_{cm}^P$ is two-valued for inverse kinematics, corresponding to the two possible signs for the solution.

The angle relations between the lab and center-of-momentum frames for the recoiling target nucleus are connected by

$$\frac{\sin(\vartheta_{cm}^T - \theta_{lab}^T)}{\sin\theta_{lab}^T} = \sqrt{\frac{E_P}{\tilde{E}}} \equiv \tilde{\tau} \qquad (9.173)$$

That is

$$\vartheta_{cm}^T = \sin^{-1}(\tilde{\tau}\sin\theta_{lab}^T) + \theta_{lab}^T \qquad (9.174)$$

where

$$\tilde{\tau} = \frac{1}{\sqrt{1 + \frac{Q}{E_P}\left(1 + \frac{m_P}{m_T}\right)}} = \frac{1}{\sqrt{1 + \frac{Q}{E_P/m_P}\left(\frac{m_P + m_T}{m_P m_T}\right)}} \qquad (9.175)$$

Note that $\tilde{\tau}$ is the same under interchange of the two nuclei at the same incident energy/nucleon, and that $\tilde{\tau}$ is always larger than or equal to unity since $Q$ is negative. For elastic scattering $\tilde{\tau} = 1$, which gives

$$\theta_{lab}^T = \frac{1}{2}(\pi - \vartheta) \qquad \text{(Recoil lab angle for elastic scattering)}$$

For the target recoil equation 9.173 can be rewritten as

$$\tan\theta_{lab}^T = \frac{\sin\vartheta_{cm}^T}{\cos\vartheta_{cm}^T + \tilde{\tau}} \qquad \text{(Target lab to CM angle conversion)}$$

Figure 9.16: The kinematic correlation of the laboratory and center-of-mass scattering angles of the recoiling projectile and target nuclei for scattering of 4.3 MeV/nucleon $^{104}$Pd on $^{208}$Pb (left) and for the inverse reaction, 4.3 MeV/nucleon $^{208}$Pb on $^{104}$Pd (right). The projectile scattering angles are shown by solid lines while the recoiling target angles are shown by dashed lines. The blue curves correspond to elastic scattering, that is $Q = 0$, while the red curves correspond to inelastic scattering with $Q = -5$ MeV.


Velocity vector hodographs provide useful insight into the behavior of the kinematic solutions. As shown in figure 9.15, in the center-of-momentum frame the scattered projectile has a fixed final speed $u'_P$, that is, the velocity vector describes a circle as a function of $\vartheta_{cm}^P$. The vector addition of this vector and the velocity of the center of mass, $-\mathbf{u}_T$, gives the laboratory frame velocity $\mathbf{w}'_P$. Note that for normal kinematics, where $m_P < m_T$, then $|u_T| < |u'_P|$, leading to a monotonic one-to-one mapping between the center-of-momentum angle $\vartheta_{cm}^P$ and $\theta_{lab}^P$. However, for inverse kinematics, where $m_P > m_T$, then $|u_T| > |u'_P|$, leading to two-valued solutions at any fixed laboratory scattering angle $\theta_{lab}^P$.

Billiard ball collisions are an especially simple example where the two masses are identical and the collision is essentially elastic. Then essentially $\tau = \tilde{\tau} = 1$, $\theta_{lab}^P = \frac{\vartheta}{2}$ and $\theta_{lab}^T = \frac{1}{2}(\pi - \vartheta)$, that is, the angle between the scattered billiard balls is $\frac{\pi}{2}$.
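The angle relations above translate directly into code. The Python sketch below evaluates $\tau$ and $\tilde{\tau}$ from equations 9.170 and 9.175 and converts center-of-momentum angles to laboratory angles using the tangent relations. Function and variable names are assumptions made for this illustration; `atan2` is used so that laboratory angles beyond 90° are returned correctly in normal kinematics.

```python
import math


def tau_target(m_p, m_t, e_p, q=0.0):
    """Equation 9.175; e_p and q in the same energy units."""
    return 1.0 / math.sqrt(1.0 + (q / e_p) * (1.0 + m_p / m_t))


def tau_projectile(m_p, m_t, e_p, q=0.0):
    """Equation 9.170: tau = (m_P / m_T) * tau-tilde."""
    return (m_p / m_t) * tau_target(m_p, m_t, e_p, q)


def lab_angle(theta_cm, tau):
    """tan(theta_lab) = sin(theta_cm) / (cos(theta_cm) + tau); radians."""
    return math.atan2(math.sin(theta_cm), math.cos(theta_cm) + tau)


# Billiard balls: equal masses, elastic, CM scattering angle of 70 degrees.
vartheta = math.radians(70.0)
t_p = tau_projectile(1.0, 1.0, 1.0)
t_t = tau_target(1.0, 1.0, 1.0)
theta_p = lab_angle(vartheta, t_p)            # projectile, CM angle vartheta
theta_t = lab_angle(math.pi - vartheta, t_t)  # target, CM angle pi - vartheta
print(math.degrees(theta_p), math.degrees(theta_t),
      math.degrees(theta_p + theta_t))
```

The same functions accept a negative `q` for inelastic scattering; mass numbers can be used in place of masses when only mass ratios matter.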

Both normal and inverse kinematics are illustrated in figure 9.16, which shows the dependence of the projectile and target scattering angles in the laboratory frame as a function of center-of-momentum scattering angle for the Coulomb scattering of $^{104}$Pd by $^{208}$Pb, that is, for a mass ratio of 2 : 1. Both normal and inverse kinematics are shown for the same bombarding energy of 4.3 MeV/nucleon for elastic scattering and for inelastic scattering with a Q-value of $-5$ MeV.

Figure 9.17: Recoil energies, in MeV, versus laboratory scattering angle, shown on the left for scattering of 447 MeV $^{104}$Pd by $^{208}$Pb with $Q = -5.0$ MeV, and shown on the right for scattering of 894 MeV $^{208}$Pb on $^{104}$Pd with $Q = -5.0$ MeV.


Since $\sin(\vartheta_{cm}^T - \theta_{lab}^T) \leq 1$, equation 9.173 implies that $\tilde{\tau}\sin\theta_{lab}^T \leq 1$. Since $\tilde{\tau}$ is always larger than or equal to unity there is a maximum scattering angle in the laboratory frame for the recoiling target nucleus given by

$$\sin\theta_{max}^T = \frac{1}{\tilde{\tau}} \qquad (9.176)$$

For elastic scattering $\theta_{max}^T = \sin^{-1}(1) = 90^\circ$ since $\tilde{\tau} = 1$ for both 894 MeV $^{208}$Pb bombarding $^{104}$Pd and the inverse reaction using a 447 MeV $^{104}$Pd beam scattered by a $^{208}$Pb target. A Q-value of $-5$ MeV gives $\tilde{\tau} = 1.002808$, which implies a maximum scattering angle of $\theta_{max}^T = 85.71^\circ$ for both 894 MeV $^{208}$Pb bombarding $^{104}$Pd and the inverse reaction of a 447 MeV $^{104}$Pd beam scattered by a $^{208}$Pb target. As a consequence there are two solutions for $\vartheta_{cm}^T$ for any allowed value of $\theta_{lab}^T$, as illustrated in figure 9.16.

Since $\sin(\vartheta_{cm}^P - \theta_{lab}^P) \leq 1$, equation 9.169 implies that $\tau\sin\theta_{lab}^P \leq 1$. For a 447 MeV $^{104}$Pd beam scattered by a $^{208}$Pb target $\frac{m_P}{m_T} = 0.50$, thus $\tau = 0.5$ for elastic scattering, which implies that there is no upper bound to $\theta_{lab}^P$. This leads to a one-to-one correspondence between $\theta_{lab}^P$ and $\vartheta_{cm}^P$ for normal kinematics. In contrast, the projectile has a maximum scattering angle in the laboratory frame for inverse kinematics since $\frac{m_P}{m_T} = 2.0$, leading to an upper bound to $\theta_{lab}^P$ given by

$$\sin\theta_{max}^P = \frac{1}{\tau} \qquad (9.177)$$

For elastic scattering $\tau = 2$, implying $\theta_{max}^P = 30^\circ$. In addition to having a maximum value for $\theta_{lab}^P$ when $\tau > 1$, there also are two solutions for $\vartheta_{cm}^P$ for any allowed value of $\theta_{lab}^P$. For the example of 894 MeV $^{208}$Pb bombarding $^{104}$Pd, this leads to a maximum projectile scattering angle of $\theta_{max}^P = 30.0^\circ$ for elastic scattering and $\theta_{max}^P = 29.907^\circ$ for $Q = -5$ MeV.
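For inverse kinematics the two center-of-momentum solutions at a given laboratory angle follow from the two branches of the inverse sine in equation 9.172. The Python sketch below, which uses hypothetical function names and restricts itself to the elastic case $\tau = 2$ discussed above, computes the maximum laboratory angle and both branches, then maps each branch back to the laboratory frame as a consistency check.

```python
import math


def max_lab_angle(tau):
    """Equations 9.176 and 9.177: sin(theta_max) = 1/tau; None if tau < 1."""
    if tau < 1.0:
        return None  # no upper bound on the laboratory angle
    return math.asin(1.0 / tau)


def cm_angles(theta_lab, tau):
    """Both solutions of sin(theta_cm - theta_lab) = tau sin(theta_lab).

    The second branch is physical only for tau > 1 (inverse kinematics).
    """
    alpha = math.asin(tau * math.sin(theta_lab))
    return theta_lab + alpha, theta_lab + math.pi - alpha


def lab_angle(theta_cm, tau):
    """tan(theta_lab) = sin(theta_cm) / (cos(theta_cm) + tau)."""
    return math.atan2(math.sin(theta_cm), math.cos(theta_cm) + tau)


tau = 2.0  # elastic scattering with m_P / m_T = 2
theta_max = max_lab_angle(tau)
low, high = cm_angles(math.radians(20.0), tau)
print(math.degrees(theta_max), math.degrees(low), math.degrees(high))
print(math.degrees(lab_angle(low, tau)), math.degrees(lab_angle(high, tau)))
```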

Kinetic  energies 

The initial total kinetic energy in the center-of-momentum frame is

$$E_{cm}^{Initial} = E_P\frac{m_T}{m_P + m_T} \qquad (9.178)$$

The final total kinetic energy in the center-of-momentum frame is

$$E_{cm}^{Final} = E_{cm}^{Initial} + Q = \tilde{E}\frac{m_T}{m_P + m_T} \qquad (9.179)$$

In the laboratory frame the kinetic energies of the scattered projectile and recoiling target nucleus are given by

$$E_{lab}^P = \left(\frac{m_T}{m_P + m_T}\right)^2\left(1 + \tau^2 + 2\tau\cos\vartheta_{cm}^P\right)\tilde{E} \qquad (9.180)$$

$$E_{lab}^T = \frac{m_P m_T}{(m_P + m_T)^2}\left(1 + \tilde{\tau}^2 + 2\tilde{\tau}\cos\vartheta_{cm}^T\right)\tilde{E} \qquad (9.181)$$

where $\vartheta_{cm}^P$ and $\vartheta_{cm}^T$ are the center-of-mass scattering angles for the scattered projectile and target nuclei, respectively.

For the chosen incident energies the normal and inverse reactions give the same center-of-momentum energy of 298 MeV, which is the energy available to the interaction between the colliding nuclei. However, the kinetic energy of the center of momentum is $447 - 298 = 149$ MeV for normal kinematics and $894 - 298 = 596$ MeV for inverse kinematics. This trivial center-of-momentum kinetic energy does not contribute to the reaction. Note that inverse kinematics focusses all the scattered nuclei into the forward hemisphere, which reduces the required solid angle for particle detection.
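The energy bookkeeping can be reproduced with a few lines of Python. The sketch below evaluates equations 9.168, 9.178, 9.180 and 9.181 for the $^{104}$Pd and $^{208}$Pb example, using mass numbers in place of masses (an approximation adopted here), and confirms that the two laboratory energies sum to $E_P + Q$. Function names are illustrative.

```python
import math


def e_cm_initial(e_p, m_p, m_t):
    """Equation 9.178."""
    return e_p * m_t / (m_p + m_t)


def lab_energies(e_p, m_p, m_t, q, vartheta):
    """Equations 9.180 and 9.181; vartheta is the CM scattering angle (rad)."""
    e_tilde = e_p + q * (1.0 + m_p / m_t)  # equation 9.168
    tau_t = math.sqrt(e_p / e_tilde)       # equation 9.173
    tau_p = (m_p / m_t) * tau_t            # equation 9.169
    total = m_p + m_t
    e_proj = ((m_t / total) ** 2
              * (1.0 + tau_p ** 2 + 2.0 * tau_p * math.cos(vartheta))
              * e_tilde)
    # The target recoils at the CM angle pi - vartheta.
    e_targ = ((m_p * m_t / total ** 2)
              * (1.0 + tau_t ** 2 + 2.0 * tau_t * math.cos(math.pi - vartheta))
              * e_tilde)
    return e_proj, e_targ


print(e_cm_initial(447.0, 104.0, 208.0), e_cm_initial(894.0, 208.0, 104.0))
e_proj, e_targ = lab_energies(447.0, 104.0, 208.0, -5.0, math.radians(60.0))
print(e_proj, e_targ, e_proj + e_targ)
```

The sum of the two laboratory energies is independent of the scattering angle, as required by energy conservation.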


Solid  angles 


The laboratory-frame solid angles for the scattered projectile and target are taken to be $d\omega_P$ and $d\omega_T$ respectively, while the center-of-momentum solid angles are $d\Omega_P$ and $d\Omega_T$ respectively. The Jacobians relating the solid angles are

$$\left|\frac{d\omega_P}{d\Omega_P}\right| = \left(\frac{\sin\theta_{lab}^P}{\sin\vartheta_{cm}^P}\right)^2\left|\cos(\vartheta_{cm}^P - \theta_{lab}^P)\right| \qquad (9.182)$$

$$\left|\frac{d\omega_T}{d\Omega_T}\right| = \left(\frac{\sin\theta_{lab}^T}{\sin\vartheta_{cm}^T}\right)^2\left|\cos(\vartheta_{cm}^T - \theta_{lab}^T)\right| \qquad (9.183)$$


These  can  be  used  to  transform  the  calculated  center-of-momentum  differential  cross  sections  to  the 
laboratory  frame  for  comparison  with  measured  values.  Note  that  relative  to  the  center-of-momentum  frame, 
the  forward  focussing  increases  the  observed  differential  cross  sections  in  the  forward  laboratory  frame  and 
decreases  them  in  the  backward  hemisphere. 
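A Jacobian of this kind can be checked numerically. The Python sketch below assumes the closed form $|d\omega/d\Omega| = (\sin\theta_{lab}/\sin\vartheta_{cm})^2|\cos(\vartheta_{cm} - \theta_{lab})|$, which follows from the tangent relation between laboratory and center-of-momentum angles, and compares it with a finite-difference evaluation of $\sin\theta_{lab}\,d\theta_{lab}/(\sin\vartheta_{cm}\,d\vartheta_{cm})$. Names and test values are illustrative.

```python
import math


def lab_angle(theta_cm, tau):
    """tan(theta_lab) = sin(theta_cm) / (cos(theta_cm) + tau); radians."""
    return math.atan2(math.sin(theta_cm), math.cos(theta_cm) + tau)


def jacobian_closed(theta_cm, tau):
    """Assumed closed form of |d(omega_lab) / d(Omega_cm)|."""
    theta_lab = lab_angle(theta_cm, tau)
    return ((math.sin(theta_lab) / math.sin(theta_cm)) ** 2
            * abs(math.cos(theta_cm - theta_lab)))


def jacobian_numeric(theta_cm, tau, h=1.0e-6):
    """sin(theta_lab) d(theta_lab) / (sin(theta_cm) d(theta_cm)), numerically."""
    dtheta = (lab_angle(theta_cm + h, tau)
              - lab_angle(theta_cm - h, tau)) / (2.0 * h)
    return abs(math.sin(lab_angle(theta_cm, tau)) / math.sin(theta_cm) * dtheta)


for tau in (0.5, 2.0):
    for theta_cm in (0.4, 1.2, 2.6):
        print(tau, theta_cm, jacobian_closed(theta_cm, tau),
              jacobian_numeric(theta_cm, tau))
```

A laboratory cross section is obtained by dividing the center-of-momentum cross section by this ratio; values below unity at forward angles correspond to the forward focussing described above.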


Exploitation  of  two-body  kinematics 

Computing the transform relations between the center-of-mass and laboratory coordinate frames is non-trivial, and this transformation for two-body scattering is used extensively in many fields of physics. This discussion has assumed non-relativistic two-body kinematics. Relativistic two-body kinematics encompasses non-relativistic kinematics, as discussed in chapter 16.4. Many computer codes are available that can be used for making either non-relativistic or relativistic transformations.

It  is  stressed  that  the  underlying  physics  for  two  interacting  bodies  is  identical  irrespective  of  whether 
the  reaction  is  observed  in  the  center-of-mass  or  the  laboratory  coordinate  frames.  That  is,  no  new  physics 
is  involved  in  the  kinematic  transformation.  However,  the  transformation  between  these  frames  can  dramat- 
ically alter  the  angles  and  velocities  of  the  observed  scattered  bodies  which  can  be  beneficial  experimentally. 
For example, in heavy-ion nuclear physics the projectile and target nuclei can be interchanged, leading to very different velocities and scattering angles in the laboratory frame of reference, which can greatly facilitate identification and observation of the velocity vectors of the scattered nuclei. In high-energy physics it is advantageous to collide beams having identical, but opposite, linear momentum vectors, since then the laboratory frame is the center-of-mass frame, and the energy required to accelerate the colliding bodies is minimized.

9.14  Summary

This  chapter  has  focussed  on  the  classical  mechanics  of  bodies  interacting  via  conservative,  two-body,  central 
interactions.  The  following  are  the  main  topics  presented  in  this  chapter. 


Equivalent one-body representation for two bodies interacting via a central interaction  The equivalent one-body representation of the motion of two bodies interacting via a two-body central interaction greatly simplifies solution of the equations of motion. The position vectors $\mathbf{r}_1$ and $\mathbf{r}_2$ are expressed in terms of the center-of-mass vector $\mathbf{R}$, with total mass $M = m_1 + m_2$, while the position vector $\mathbf{r}$, with associated reduced mass $\mu = \frac{m_1 m_2}{m_1 + m_2}$, describes the relative motion of the two bodies in the center of mass. The total Lagrangian then separates into two independent parts

$$L = \frac{1}{2}M\left|\dot{\mathbf{R}}\right|^2 + L_{cm} \qquad (9.16)$$

where the center-of-mass Lagrangian is

$$L_{cm} = \frac{1}{2}\mu\left|\dot{\mathbf{r}}\right|^2 - U(r) \qquad (9.17)$$

Equations 9.10 and 9.11 can be used to derive the actual spatial trajectories of the two bodies, expressed in terms of $\mathbf{r}_1$ and $\mathbf{r}_2$, from the relative equations of motion, written in terms of $\mathbf{R}$ and $\mathbf{r}$, for the equivalent one-body solution.
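
The change of variables between the two-body and equivalent one-body descriptions can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the sign convention $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ for the relative vector is an assumption. Because the map is linear, the same functions apply to velocities, which allows a check that the kinetic energy separates into a center-of-mass part and a relative part.

```python
def to_cm_and_relative(m1, r1, m2, r2):
    """Return (M, mu, R, r) for two bodies located at r1 and r2."""
    M = m1 + m2
    mu = m1 * m2 / M
    R = tuple((m1 * a + m2 * b) / M for a, b in zip(r1, r2))
    r = tuple(a - b for a, b in zip(r1, r2))  # assumed convention: r = r1 - r2
    return M, mu, R, r


def to_individual(m1, m2, R, r):
    """Invert the map: recover r1 and r2 from R and r."""
    M = m1 + m2
    r1 = tuple(Rc + (m2 / M) * rc for Rc, rc in zip(R, r))
    r2 = tuple(Rc - (m1 / M) * rc for Rc, rc in zip(R, r))
    return r1, r2


def kinetic_energy(m, v):
    """(1/2) m |v|^2 for a velocity given as a tuple of components."""
    return 0.5 * m * sum(c * c for c in v)
```

Applying `to_cm_and_relative` to the two velocities and comparing `kinetic_energy(m1, v1) + kinetic_energy(m2, v2)` with `kinetic_energy(M, V) + kinetic_energy(mu, v)` gives the same number, which is the kinetic-energy content of the separation of the Lagrangian.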


Angular momentum  Noether's theorem shows that the angular momentum is conserved if only a spherically-symmetric two-body central force acts between the two interacting bodies. The plane of motion is perpendicular to the angular momentum vector and thus the Lagrangian can be expressed in polar coordinates as

$$L_{cm} = \frac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\psi}^2\right) - U(r) \qquad (9.22)$$


Differential orbit equation of motion  The Binet transformation $u = \frac{1}{r}$ allows the center-of-mass Lagrangian $L_{cm}$ for a central force $\mathbf{F} = f(r)\hat{\mathbf{r}}$ to be used to express the differential orbit equation for the radial motion as

$$\frac{d^2u}{d\psi^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2}F\left(\frac{1}{u}\right) \qquad (9.39)$$

The Lagrangian and the Hamiltonian were both used to derive the equations of motion for two bodies interacting via a two-body, conservative, central interaction. The general features of the conservation of angular momentum and conservation of energy for a two-body, central potential were presented.


Inverse-square, two-body, central force  The inverse-square, two-body, central force is of pivotal importance in nature since it applies to both the gravitational force and the Coulomb force. The underlying symmetries of the inverse-square, two-body, central interaction lead to conservation of angular momentum, conservation of energy, Gauss's law, and two-body orbits that are closed, degenerate conic sections for which the eccentricity vector is conserved. The radial dependence, relative to the force center which lies at one focus of the conic section, is given by

$$\frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos(\psi - \psi_0)\right] \qquad (9.58)$$

where the orbit eccentricity $\epsilon$ equals

$$\epsilon = \sqrt{1 + \frac{2El^2}{\mu k^2}} \qquad (9.62)$$

These lead to Kepler's three laws of motion for two bodies in a bound orbit due to the attractive gravitational force, for which $k = -Gm_1m_2$. The inverse-square law is special in that the eccentricity vector $\mathbf{A}$ is a third invariant of the motion, where

$$\mathbf{A} = (\mathbf{p}\times\mathbf{L}) + (\mu k\hat{\mathbf{r}}) \qquad (9.86)$$

The eccentricity vector unambiguously defines the orientation and direction of the major axis of the elliptical orbit. The invariance of the eccentricity vector, and the existence of stable closed orbits, are manifestations of the dynamical O4 symmetry.

Isotropic, harmonic, two-body, central force  The isotropic, harmonic, two-body, central interaction
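is of interest since, like the inverse-square law force, it leads to closed elliptical orbits described by

$$\frac{1}{r^2} = \frac{E\mu}{l^2}\left[1 + \left(1 + \frac{kl^2}{E^2\mu}\right)^{\frac{1}{2}}\cos 2(\psi - \psi_0)\right] \qquad (9.107)$$

where the eccentricity $\epsilon$ is given by

$$\left(1 + \frac{kl^2}{E^2\mu}\right)^{\frac{1}{2}} = \frac{\epsilon^2}{2 - \epsilon^2} \qquad (9.108)$$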


The  harmonic  force  orbits  are  distinctly  different  from  those  for  the  inverse-square  law  in  that  the  force  center 
is  at  the  center  of  the  ellipse,  rather  than  at  the  focus  for  the  inverse-square  law  force.  This  elliptical  orbit 
is  reflection  symmetric  for  the  harmonic  force,  but  not  for  the  inverse  square  force.  The  isotropic  harmonic 
two-body  force  leads  to  invariance  of  the  symmetry  tensor,  and  stable  closed  orbits,  which  are  manifestations 
of  the  dynamical  SU3  symmetry. 


Orbit stability  Bertrand's theorem states that only the inverse-square law and the linear radial dependences of the central forces lead to stable closed bound orbits that do not precess. These are manifestations of the dynamical symmetries that occur for these two specific radial forms of two-body forces.

The three-body problem  The difficulties encountered in solving the equations of motion for three bodies that are interacting via two-body central forces were discussed. The three-body motion can include chaotic motion. It was shown that solution of the three-body problem is simplified if either the planar approximation or the restricted three-body approximation is applicable.


Two-body scattering  The total and differential two-body scattering cross sections were introduced. It was shown that for the inverse-square law force there is a simple relation between the impact parameter $b$ and scattering angle $\theta$ given by

$$b = \frac{k}{2E_{cm}}\cot\frac{\theta}{2} \qquad (9.155)$$

This led to the solution for the differential scattering cross section for Rutherford scattering due to the Coulomb interaction.

$$\frac{d\sigma}{d\Omega} = \frac{1}{4}\left(\frac{k}{2E_{cm}}\right)^2\frac{1}{\sin^4\frac{\theta}{2}} \qquad (9.159)$$

This cross section assumes elastic scattering by a repulsive two-body inverse-square central force. For scattering of nuclei in the Coulomb potential the constant $k$ is given by

$$k = \frac{Z_PZ_Te^2}{4\pi\varepsilon_0} \qquad (9.160)$$
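
The relations summarized above can be evaluated with a short Python sketch. The function names are hypothetical, and the numerical value used for $e^2/4\pi\varepsilon_0$ (about 1.44 MeV fm) is an approximate constant supplied here as an assumption, so that energies are in MeV, lengths in fm, and cross sections in fm² per steradian. The last function recovers the cross section from the impact parameter through $\frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right|$ using a numerical derivative, as a consistency check between the impact-parameter relation and the Rutherford formula.

```python
import math

# Approximate e^2 / (4 pi epsilon_0) in MeV fm (assumed value and units)
COULOMB_MEV_FM = 1.44


def coulomb_constant(z_projectile, z_target):
    """k = Z_P Z_T e^2 / (4 pi epsilon_0), in MeV fm."""
    return z_projectile * z_target * COULOMB_MEV_FM


def impact_parameter(k, e_cm, theta):
    """b = (k / 2E_cm) cot(theta / 2), with theta in radians."""
    return k / (2.0 * e_cm) / math.tan(theta / 2.0)


def rutherford_cross_section(k, e_cm, theta):
    """d(sigma)/d(Omega) = (1/4) (k / 2E_cm)^2 / sin^4(theta / 2)."""
    return 0.25 * (k / (2.0 * e_cm)) ** 2 / math.sin(theta / 2.0) ** 4


def cross_section_from_b(k, e_cm, theta, h=1e-6):
    """(b / sin(theta)) |db/dtheta| with a central-difference derivative."""
    db = (impact_parameter(k, e_cm, theta + h)
          - impact_parameter(k, e_cm, theta - h)) / (2.0 * h)
    return impact_parameter(k, e_cm, theta) / math.sin(theta) * abs(db)
```

The two cross-section functions agree to within the accuracy of the finite difference, which reflects that the Rutherford formula follows directly from the relation between $b$ and $\theta$.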


Two-body kinematics  The transformation from the center-of-momentum frame to the laboratory frame of reference was introduced. Such transformations are used extensively in many fields of physics for theoretical modelling of scattering, and for analysis of experimental data.

Workshop exercises

1. Listed below are several statements concerning central force motion. For each statement, give the reason why the statement is true. If a statement is only true in certain situations, then explain when it holds and when it doesn't. The system referred to below consists of mass $m_1$ located at $\mathbf{r}_1$ and mass $m_2$ located at $\mathbf{r}_2$.

• The potential energy of the system depends only on the difference $\mathbf{r}_1 - \mathbf{r}_2$, not on $\mathbf{r}_1$ and $\mathbf{r}_2$ separately.

• The potential energy of the system depends only on the magnitude of $\mathbf{r}_1 - \mathbf{r}_2$, not the direction.

• It  is  possible  to  choose  an  inertial  reference  frame  in  which  the  center  of  mass  of  the  system  is  at  rest. 

• The  total  energy  of  the  system  is  conserved. 

• The  total  angular  momentum  of  the  system  is  conserved. 

2. A particle of mass $m$ moves in a potential $U(r) = -U_0e^{-\lambda^2r^2}$.

(a) Given the constant $l$, find an implicit equation for the radius of the circular orbit. A circular orbit at $r = \rho$ is possible if

$$\left.\frac{\partial V}{\partial r}\right|_{r=\rho} = 0$$

where $V$ is the effective potential.

(b)  What  is  the  largest  value  of  l for  which  a circular  orbit  exists?  What  is  the  value  of  the  effective  potential 
at  this  critical  orbit? 

3. A particle of mass $m$ is observed to move in a spiral orbit given by the equation $r = k\theta$, where $k$ is a constant. Is it possible to have such an orbit in a central force field? If so, determine the form of the force function.

4. The interaction energy between two atoms of mass $m$ is given by the Lennard-Jones potential, $U(r) = \epsilon\left[(r_0/r)^{12} - 2(r_0/r)^6\right]$.

(a) Determine the Lagrangian of the system where $\mathbf{r}_1$ and $\mathbf{r}_2$ are the positions of the first and second mass, respectively.

(b)  Rewrite  the  Lagrangian  as  a one-body  problem  in  which  the  center-of-mass  is  stationary. 

(c)  Determine  the  equilibrium  point  and  show  that  it  is  stable. 

(d)  Determine  the  frequency  of  small  oscillations  about  the  stable  point. 

5. Consider two bodies of mass $m$ in circular orbit of radius $r_0/2$, attracted to each other by a force $F(r)$, where $r$ is the distance between the masses.

(a)  Determine  the  Lagrangian  of  the  system  in  the  center-of-mass  frame  (Hint:  a one-body  problem  subject 
to  a central  force). 

(b)  Determine  the  angular  momentum.  Is  it  conserved? 

(c)  Determine  the  equation  of  motion  in  r in  terms  of  the  angular  momentum  and  |F(r)|. 


(d) Expand your result in (c) about an equilibrium radius $r_0$ and show that the condition for stability is

$$\frac{F'(r_0)}{F(r_0)} + \frac{3}{r_0} > 0$$

6. Consider two charges of equal magnitude $q$ connected by a spring of spring constant $k'$ in circular orbit. Can the charges oscillate about some equilibrium? If so, what condition must be satisfied?

7. Consider a mass $m$ in orbit around a mass $M$, which is subject to a force $\mathbf{F} = -\frac{k}{r^2}\hat{\mathbf{r}}$, where $r$ is the distance between the masses. Show that the Runge-Lenz vector $\mathbf{A} = \mathbf{p}\times\mathbf{L} - \mu k\hat{\mathbf{r}}$ is conserved.

Problems

1. Show that the areal velocity is constant for a particle moving under the influence of an attractive force given by $F(r) = -kr$. Calculate the time averages of the kinetic and potential energies and compare with the results of the virial theorem.

2. Assume that the Earth's orbit is circular and that the Sun's mass suddenly decreases by a factor of two. (a) What orbit will the Earth then have? (b) Will the Earth escape the solar system?

3. Discuss the motion of a particle in a central inverse-square-law force field for a superimposed force whose magnitude is inversely proportional to the cube of the distance from the particle to the force center; that is

$$F(r) = -\frac{k}{r^2} - \frac{\lambda}{r^3} \qquad (k, \lambda > 0)$$

Show that the motion is described by a precessing ellipse. Consider the cases

a) $\lambda < \frac{l^2}{\mu}$, b) $\lambda = \frac{l^2}{\mu}$, c) $\lambda > \frac{l^2}{\mu}$, where $l$ is the angular momentum and $\mu$ the reduced mass.

4. A communications satellite is in a circular orbit around the Earth at a radius $R$ and velocity $v$. A rocket accidentally fires quite suddenly, giving the satellite an outward velocity $v$ in addition to its original tangential velocity $v$.

a) Calculate the ratio of the new energy and angular momentum to the old.

b) Describe the subsequent motion of the satellite and plot $T(r)$, $U(r)$, the net effective potential, and $E(r)$ after the rocket fires.

5. Two identical point objects, each of mass $m$, are bound by a linear two-body force $\mathbf{F} = -k\mathbf{r}$ where $\mathbf{r}$ is the vector distance between the two point objects. The two point objects each slide on a horizontal frictionless
plane  subject  to  a vertical  gravitational  field  g.  The  two-body  system  is  free  to  translate,  rotate  and  oscillate 
on  the  surface  of  the  frictionless  plane. 

a)  Derive  the  Lagrangian  for  the  complete  system  including  translation  and  relative  motion. 

b)  Use  Noether’s  theorem  to  identify  all  constants  of  motion. 

c)  Use  the  Lagrangian  to  derive  the  equations  of  motion  for  the  system. 

d)  Derive  the  generalized  momenta  and  the  corresponding  Hamiltonian. 

e)  Derive  the  period  for  small  amplitude  oscillations  of  the  relative  motion  of  the  two  masses. 

6. A bound binary star system comprises two spherical stars of mass $m_1$ and $m_2$ bound by their mutual gravitational attraction. Assume that the only force acting on the stars is their mutual gravitational attraction and let $r$ be the instantaneous separation distance between the centers of the two stars, where $r$ is much larger than the sum of the radii of the stars.

a)  Show  that  the  two-body  motion  of  the  binary  star  system  can  be  represented  by  an  equivalent  one-body  system 
and  derive  the  Lagrangian  for  this  system. 

b)  Show  that  the  motion  for  the  equivalent  one-body  system  in  the  center  of  mass  frame  lies  entirely  in  a plane 
and  derive  the  angle  between  the  normal  to  the  plane  and  the  angular  momentum  vector. 

c)  Show  whether  Hcm  is  a constant  of  motion  and  whether  it  equals  the  total  energy. 

d) It is known that a solution to the equation of motion for the equivalent one-body orbit for this gravitational force has the form

$$\frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos\theta\right]$$

and that the angular momentum is a constant of motion $L = l$. Use these to prove that the attractive force leading to this bound orbit is

$$\mathbf{F} = \frac{k}{r^2}\hat{\mathbf{r}}$$

where $k$ must be negative.

7. When performing the Rutherford experiment, Geiger and Marsden scattered 1.1 MeV $^4$He particles (alpha particles) from $^{238}$U at a scattering angle in the laboratory frame of $\theta = 90°$. Derive the following observables as measured in the laboratory frame.

(a) The recoil scattering angle of the $^{238}$U in the laboratory frame.

(b) The scattering angles of the $^4$He and $^{238}$U in the center-of-mass frame.

(c) The kinetic energies of the $^4$He and $^{238}$U in the laboratory frame.

(d) The impact parameter.

(e) The distance of closest approach $r_{min}$.

Chapter  10 


Non-inertial reference frames


10.1  Introduction 

Newton’s  Laws  of  motion  apply  only  to  inertial  frames  of  reference.  Inertial  frames  of  reference  make  it 
possible  to  use  Newton’s  laws  of  motion,  or  Lagrangian,  or  Hamiltonian  mechanics,  to  develop  the  necessary 
equations  of  motion.  There  are  certain  situations  where  it  is  much  more  convenient  to  treat  the  motion 
in  a non-inertial  frame  of  reference.  Examples  are  motion  in  frames  of  reference  undergoing  translational 
acceleration,  rotating  frames  of  reference,  or  frames  undergoing  both  translational  and  rotational  motion. 
This  chapter  will  analyze  the  behavior  of  dynamical  systems  in  accelerated  frames  of  reference,  especially 
rotating  frames  such  as  on  the  surface  of  the  Earth.  Newtonian  mechanics,  as  well  as  the  Lagrangian  and 
Hamiltonian  approaches,  will  be  used  to  handle  motion  in  non-inertial  reference  frames  by  introducing  extra 
inertial  forces  that  correct  for  the  fact  that  the  motion  is  being  treated  with  respect  to  a non-inertial  reference 
frame.  These  inertial  forces  are  often  called  fictitious  even  though  they  appear  real  in  the  non-inertial  frame. 
The  underlying  reasons  for  each  of  the  inertial  forces  will  be  discussed  followed  by  a presentation  of  important 
applications. 

10.2  Translational  acceleration  of  a reference  frame 

Consider an inertial system $(x_{fix}, y_{fix}, z_{fix})$ which is fixed
in space, and a non-inertial system $(x'_{mov}, y'_{mov}, z'_{mov})$ that
is  moving  in  a direction  relative  to  the  fixed  frame  such  as 
to  maintain  constant  orientations  of  the  axes  relative  to  the 
fixed  frame,  as  illustrated  in  figure  10.1.  The  fixed  frame  is 
designated  to  be  the  unprimed  frame  and,  to  avoid  confu- 
sion the  subscript  fix  is  attached  to  the  fixed  coordinates 
taken  with  respect  to  the  fixed  coordinate  frame.  Similarly, 
the  translating  reference  frame,  which  is  undergoing  trans- 
lational acceleration,  has  the  subscript  mov  attached  to  the 
coordinates  taken  with  respect  to  the  translating  frame  of 
reference.  Newton’s  Laws  of  motion  are  obeyed  only  in  the 
inertial  (unprimed)  reference  frame.  The  respective  position 
vectors  are  related  by 

$$\mathbf{r}_{fix} = \mathbf{R}_{fix} + \mathbf{r}'_{mov} \qquad (10.1)$$

where $\mathbf{r}_{fix}$ is the vector relative to the fixed frame, $\mathbf{r}'_{mov}$ is the vector relative to the translationally accelerating frame, and $\mathbf{R}_{fix}$ is the vector from the origin of the fixed frame to the origin of the accelerating frame. Differentiating equation 10.1 gives the velocity vector relation

$$\mathbf{v}_{fix} = \mathbf{V}_{fix} + \mathbf{v}'_{mov} \qquad (10.2)$$

where $\mathbf{v}_{fix} = \frac{d\mathbf{r}_{fix}}{dt}$, $\mathbf{v}'_{mov} = \frac{d\mathbf{r}'_{mov}}{dt}$, and $\mathbf{V}_{fix} = \frac{d\mathbf{R}_{fix}}{dt}$.

Figure 10.1: Inertial reference frame (unprimed), and translationally accelerating frame (primed).

Similarly the acceleration vector relation is

$$\mathbf{a}_{fix} = \mathbf{A}_{fix} + \mathbf{a}'_{mov} \qquad (10.3)$$

where $\mathbf{a}_{fix} = \frac{d^2\mathbf{r}_{fix}}{dt^2}$, $\mathbf{a}'_{mov} = \frac{d^2\mathbf{r}'_{mov}}{dt^2}$, and $\mathbf{A}_{fix} = \frac{d^2\mathbf{R}_{fix}}{dt^2}$.

In the fixed frame, Newton's laws give that

$$\mathbf{F}_{fix} = m\mathbf{a}_{fix} \qquad (10.4)$$

The force in the fixed frame can be separated into two terms, one due to the acceleration of the accelerating frame of reference $\mathbf{A}_{fix}$ and one due to the acceleration with respect to the accelerating frame $\mathbf{a}'_{mov}$.

$$\mathbf{F}_{fix} = m\mathbf{A}_{fix} + m\mathbf{a}'_{mov} \qquad (10.5)$$

Relative to the accelerating reference frame the acceleration is given by

$$m\mathbf{a}'_{mov} = \mathbf{F}_{fix} - m\mathbf{A}_{fix} \qquad (10.6)$$

The accelerating frame of reference can exploit Newton's Laws of motion using an effective translational force $\mathbf{F}'_{tran} = \mathbf{F}_{fix} - m\mathbf{A}_{fix}$. The additional $-m\mathbf{A}_{fix}$ term is called an inertial force; it can be altered by choosing a different non-inertial frame of reference, that is, it is dependent on the frame of reference in which the observer is situated.
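
As an illustration of the effective translational force, the following Python sketch evaluates $\mathbf{F}_{fix} - m\mathbf{A}_{fix}$ for a body under gravity observed from a frame accelerating horizontally. The function names, the numerical values, and the simplifying assumptions (constant accelerations, coincident origins, and a frame at rest at $t = 0$) are hypothetical choices made for the example.

```python
import math


def effective_force(force_fix, mass, frame_accel):
    """F'_tran = F_fix - m * A_fix, component by component."""
    return tuple(f - mass * a for f, a in zip(force_fix, frame_accel))


def plumb_line_tilt(frame_accel_x, g=9.81):
    """Angle from the vertical (radians) of a plumb line at rest in a frame
    accelerating horizontally at frame_accel_x."""
    return math.atan2(frame_accel_x, g)


def position_in_moving_frame(t, r0, v0, accel, frame_accel):
    """r'_mov(t) = r_fix(t) - R_fix(t) for constant accelerations, with the
    two origins coinciding and the moving frame at rest at t = 0."""
    return tuple(x + v * t + 0.5 * (a - A) * t * t
                 for x, v, a, A in zip(r0, v0, accel, frame_accel))
```

Differentiating `position_in_moving_frame` twice with respect to time and multiplying by the mass reproduces `effective_force`, so an observer in the accelerating frame can apply Newton's second law provided the inertial force is included.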


10.3  Rotating  reference  frame 

Consider a rotating frame of reference which will be designated as the double-primed (rotating) frame to differentiate it from the non-rotating primed (moving) frame, since both may be undergoing translational acceleration relative to the inertial fixed unprimed frame as described above.


10.3.1  Spatial  time  derivatives  in  a rotating,  non-translating,  reference  frame 


For simplicity assume that $\mathbf{R}_{fix} = \mathbf{V}_{fix} = 0$, that is, the primed reference frame is stationary and identical to the fixed stationary unprimed frame. The double-primed (rotating) frame is a non-inertial frame rotating with respect to the origin of the fixed primed frame. Appendix D.2.3 shows that an infinitesimal rotation $d\boldsymbol{\theta}$ about an instantaneous axis of rotation leads to an infinitesimal displacement $d\mathbf{r}^R$ where

$$d\mathbf{r}^R = d\boldsymbol{\theta}\times\mathbf{r}'_{mov} \qquad (10.7)$$

Consider that during a time $dt$, the position vector in the fixed primed reference frame moves by an arbitrary infinitesimal distance $d\mathbf{r}'_{mov}$. As illustrated in figure 10.2, this infinitesimal distance in the primed non-rotating frame can be split into two parts:

a) $d\mathbf{r}^R = d\boldsymbol{\theta}\times\mathbf{r}'_{mov}$, which is due to rotation of the rotating frame with respect to the translating primed frame.

b) $d\mathbf{r}''_{rot}$, which is the motion with respect to the rotating (double-primed) frame.

That is, the motion has been arbitrarily divided into a part that is due to the rotation of the double-primed frame, plus the vector displacement measured in this rotating (double-primed) frame. It is always possible to make such a decomposition of the displacement as long as the vector sum can be written as

$$d\mathbf{r}'_{mov} = d\mathbf{r}''_{rot} + d\boldsymbol{\theta}\times\mathbf{r}'_{mov} \qquad (10.8)$$

Figure 10.2: Infinitesimal displacement in the non-rotating primed frame and in the rotating double-primed reference frame.

Since $d\boldsymbol{\theta} = \boldsymbol{\omega}dt$, the time differential of the displacement, equation 10.8, can be written as

$$\left(\frac{d\mathbf{r}'}{dt}\right)_{mov} = \left(\frac{d\mathbf{r}''}{dt}\right)_{rot} + \boldsymbol{\omega}\times\mathbf{r}'_{mov} \qquad (10.9)$$

The important conclusion is that a velocity measured in a non-rotating reference frame, $\left(\frac{d\mathbf{r}'}{dt}\right)_{mov}$, can be expressed as the sum of the velocity $\left(\frac{d\mathbf{r}''}{dt}\right)_{rot}$, measured relative to a rotating frame, plus the term $\boldsymbol{\omega}\times\mathbf{r}'_{mov}$ which accounts for the rotation of the frame. The division of the $d\mathbf{r}'_{mov}$ vector into two parts, a part due to rotation of the frame plus a part with respect to the rotating frame, is valid for any vector as shown below.


10.3.2  General  vector  in  a rotating,  non-translating,  reference  frame 

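Consider an arbitrary vector $\mathbf{G}$ which can be expressed in terms of components along the three unit vector basis $\hat{\mathbf{e}}_i^{fix}$ in the fixed inertial frame as

$$\mathbf{G} = \sum_{i=1}^{3} G_i^{fix}\,\hat{\mathbf{e}}_i^{fix} \qquad (10.10)$$

Neglecting translational motion, it can be expressed in terms of the three unit vectors of the non-inertial rotating frame basis $\hat{\mathbf{e}}_i^{rot}$ as

$$\mathbf{G} = \sum_{i=1}^{3} (G_i)_{rot}\,\hat{\mathbf{e}}_i^{rot} \qquad (10.11)$$

Since the unit basis vectors $\hat{\mathbf{e}}_i^{rot}$ are constant in the rotating frame, that is,

$$\left(\frac{d\hat{\mathbf{e}}_i^{rot}}{dt}\right)_{rot} = 0 \qquad (10.12)$$

then the time derivatives of $\mathbf{G}$ in the rotating coordinate system $\hat{\mathbf{e}}_i^{rot}$ can be written as

$$\left(\frac{d\mathbf{G}}{dt}\right)_{rot} = \sum_{i=1}^{3}\left(\frac{dG_i}{dt}\right)_{rot}\hat{\mathbf{e}}_i^{rot} \qquad (10.13)$$

The inertial-frame time derivative taken with components along the rotating coordinate basis $\hat{\mathbf{e}}_i^{rot}$, equation 10.11, is

$$\left(\frac{d\mathbf{G}}{dt}\right)_{fix} = \sum_{i=1}^{3}\left(\frac{dG_i}{dt}\right)_{rot}\hat{\mathbf{e}}_i^{rot} + \sum_{i=1}^{3}(G_i)_{rot}\left(\frac{d\hat{\mathbf{e}}_i^{rot}}{dt}\right)_{fix} \qquad (10.14)$$

Substituting the unit vector $\hat{\mathbf{e}}^{rot}$ for $\mathbf{r}'_{mov}$ in equation 10.9, plus using equation 10.12, gives that

$$\left(\frac{d\hat{\mathbf{e}}^{rot}}{dt}\right)_{fix} = \boldsymbol{\omega}\times\hat{\mathbf{e}}^{rot} \qquad (10.15)$$

Substituting this into the second term of equation 10.14 gives

$$\left(\frac{d\mathbf{G}}{dt}\right)_{fix} = \left(\frac{d\mathbf{G}}{dt}\right)_{rot} + \boldsymbol{\omega}\times\mathbf{G} \qquad (10.16)$$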

This important identity relates the time derivatives of any vector expressed in both the inertial frame and the rotating non-inertial frame bases. Note that the $\boldsymbol{\omega}\times\mathbf{G}$ term originates from the fact that the unit basis vectors of the rotating reference frame are time dependent with respect to the non-rotating frame basis vectors as given by equation (10.15). Equation (10.16) is used extensively for problems involving rotating frames. For example, for the special case where $\mathbf{G} = \mathbf{r}'$, equation (10.16) relates the velocity vectors in the fixed and rotating frames as given in equation (10.9).
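
The identity (10.16) can be checked numerically. The sketch below is an illustration under stated assumptions: the rotation is about the fixed $z$ axis at a constant rate $w$, the components of $\mathbf{G}$ along the rotating basis are arbitrary hypothetical functions of time, and the fixed-frame derivative is estimated by a central difference. None of the function names come from the text.

```python
import math


def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotating_basis(w, t):
    """Unit vectors of a frame rotating about the fixed z axis at rate w,
    expressed in fixed-frame components."""
    c, s = math.cos(w * t), math.sin(w * t)
    return (c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0)


def to_fixed(components, basis):
    """Sum over i of G_i e_i, returned as fixed-frame components."""
    return tuple(sum(g * e[k] for g, e in zip(components, basis))
                 for k in range(3))


def g_rot(t):
    """Hypothetical components of G along the rotating basis."""
    return (2.0 * t, 1.0, 0.5 * t * t)


def dg_rot(t):
    """Time derivatives of those components (the rotating-frame derivative)."""
    return (2.0, 0.0, t)


def both_sides(w, t, h=1e-5):
    """Return (dG/dt)_fix and (dG/dt)_rot + omega x G in fixed components."""
    g_fix = lambda s: to_fixed(g_rot(s), rotating_basis(w, s))
    lhs = tuple((p - m) / (2.0 * h) for p, m in zip(g_fix(t + h), g_fix(t - h)))
    rot_part = to_fixed(dg_rot(t), rotating_basis(w, t))
    spin_part = cross((0.0, 0.0, w), g_fix(t))
    rhs = tuple(a + b for a, b in zip(rot_part, spin_part))
    return lhs, rhs
```

Calling `both_sides(w=1.3, t=0.8)` returns two tuples that agree to within the finite-difference error, and setting `w=0.0` removes the $\boldsymbol{\omega}\times\mathbf{G}$ term so that the two derivatives coincide.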

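As another example, consider the vector $\dot{\boldsymbol{\omega}}$

$$\dot{\boldsymbol{\omega}} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{fix} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rot} + \boldsymbol{\omega}\times\boldsymbol{\omega} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rot} \qquad (10.17)$$

That is, the angular acceleration $\dot{\boldsymbol{\omega}}$ has the same value in both the fixed and rotating frames of reference.

10.4  Reference frame undergoing rotation plus translation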


Consider the case where the system is accelerating in translation as well as rotating, that is, the primed frame is the non-rotating translating frame. The position vector $\mathbf{r}_{fix}$ is taken with respect to the inertial fixed unprimed frame, which can be written in terms of the fixed unit basis vectors $(\hat{\mathbf{i}}_{fix}, \hat{\mathbf{j}}_{fix}, \hat{\mathbf{k}}_{fix})$. This $\mathbf{r}_{fix}$ vector can be written as the vector sum of the translational motion $\mathbf{R}_{fix}$ of the origin of the rotating system with respect to the fixed frame plus the position $\mathbf{r}'_{mov}$ with respect to this translating primed frame basis

$$\mathbf{r}_{fix} = \mathbf{R}_{fix} + \mathbf{r}'_{mov} \qquad (10.18)$$

The time differential is

$$\frac{d\mathbf{r}_{fix}}{dt} = \frac{d\mathbf{R}_{fix}}{dt} + \frac{d\mathbf{r}'_{mov}}{dt} \qquad (10.19)$$

The vector $\mathbf{r}'_{mov}$ is the position with respect to the translating frame of reference, which can be expressed in terms of the unit vectors $(\hat{\mathbf{i}}'_{mov}, \hat{\mathbf{j}}'_{mov}, \hat{\mathbf{k}}'_{mov})$.

Equation 10.19 takes into account the translational motion of the moving primed frame basis. Now, assuming that the double-primed frame rotates about the origin of the moving primed frame, the net displacement with respect to the original inertial frame basis can be combined with equation 10.9, leading to the relation

$$\frac{d\mathbf{r}_{fix}}{dt} = \frac{d\mathbf{R}_{fix}}{dt} + \frac{d\mathbf{r}''_{rot}}{dt} + \boldsymbol{\omega}\times\mathbf{r}'_{mov} \qquad (10.20)$$

Here the double-primed frame is both rotating and translating. Vectors in this frame are expressed in terms of the unit basis vectors $(\hat{\mathbf{i}}''_{rot}, \hat{\mathbf{j}}''_{rot}, \hat{\mathbf{k}}''_{rot})$.

Expressed as velocities, equation 10.20 can be written as

$$\mathbf{v}_{fix} = \mathbf{V}_{fix} + \mathbf{v}''_{rot} + \boldsymbol{\omega}\times\mathbf{r}'_{mov} \qquad (10.21)$$

where:

$\mathbf{v}_{fix}$ is the velocity measured with respect to the inertial (unprimed) frame basis.

$\mathbf{V}_{fix}$ is the velocity of the origin of the non-inertial translating (primed) frame basis with respect to the origin of the inertial (unprimed) frame basis.

$\mathbf{v}''_{rot}$ is the velocity of the particle with respect to the non-inertial rotating (double-primed) frame basis, the origin of which is both translating and rotating.

$\boldsymbol{\omega}\times\mathbf{r}'_{mov}$ is the motion of the rotating (double-primed) frame with respect to the linearly-translating (primed) frame basis.

Thus this relation takes into account both the translational velocity plus rotation of the reference coordinate frame basis vectors.


10.5  Newton’s  law  of  motion  in  a non-inertial  frame 


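The acceleration in the fixed inertial frame can be derived by differentiating the general velocity relation for $\mathbf{v}_{fix}$, equation 10.21, in the fixed frame basis, which gives

$$\mathbf{a}_{fix} = \left(\frac{d\mathbf{v}_{fix}}{dt}\right)_{fixed} = \left(\frac{d\mathbf{V}_{fix}}{dt}\right)_{fixed} + \left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{fixed} + \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{fixed}\times\mathbf{r}'_{mov} + \boldsymbol{\omega}\times\left(\frac{d\mathbf{r}'_{mov}}{dt}\right)_{fixed} \qquad (10.22)$$

Now we wish to use the general transformation to a rotating frame basis, which requires inclusion of the time dependence of the unit vectors in the rotating frame, that is,

$$\left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{fixed} = \left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{rotating} + \boldsymbol{\omega}\times\mathbf{v}''_{rot} \qquad (10.23)$$

$$\left(\frac{d\boldsymbol{\omega}}{dt}\right)_{fixed}\times\mathbf{r}'_{mov} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rotating}\times\mathbf{r}'_{mov} \qquad (10.24)$$

$$\boldsymbol{\omega}\times\left(\frac{d\mathbf{r}'_{mov}}{dt}\right)_{fixed} = \boldsymbol{\omega}\times\mathbf{v}''_{rot} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) \qquad (10.25)$$

Using equations 10.23, 10.24, and 10.25 gives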

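$$\mathbf{a}_{fix} = \mathbf{A}_{fix} + \mathbf{a}''_{rot} + 2\boldsymbol{\omega}\times\mathbf{v}''_{rot} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov} \qquad (10.26)$$

where the acceleration in the rotating frame is $\mathbf{a}''_{rot} = \left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{rot}$, while the velocity is $\mathbf{v}''_{rot} = \left(\frac{d\mathbf{r}''_{rot}}{dt}\right)_{rot}$, and $\mathbf{A}_{fix}$ is with respect to the fixed frame.

Newton's laws of motion are obeyed in the inertial frame, that is

$$\mathbf{F}_{fix} = m\mathbf{a}_{fix} = m\left(\mathbf{A}_{fix} + \mathbf{a}''_{rot} + 2\boldsymbol{\omega}\times\mathbf{v}''_{rot} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov}\right) \qquad (10.27)$$

In the double-primed frame, which may be both rotating and accelerating in translation, one can ascribe an effective force $\mathbf{F}^{eff}_{rot}$ that obeys an effective Newton's law for the acceleration $\mathbf{a}''_{rot}$ in the rotating frame

$$\mathbf{F}^{eff}_{rot} = m\mathbf{a}''_{rot} = \mathbf{F}_{fix} - m\left(\mathbf{A}_{fix} + 2\boldsymbol{\omega}\times\mathbf{v}''_{rot} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov}\right) \qquad (10.28)$$

Note that the effective force $\mathbf{F}^{eff}_{rot}$ comprises the physical force $\mathbf{F}_{fix}$, minus four non-inertial forces that are introduced to correct for the fact that the rotating reference frame is a non-inertial frame.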
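
The effective Newton's law can be tested on the simplest possible case, a force-free particle viewed from a frame rotating uniformly about the fixed $z$ axis. With $\mathbf{F}_{fix} = 0$, $\mathbf{A}_{fix} = 0$ and $\dot{\boldsymbol{\omega}} = 0$, the acceleration seen in the rotating frame should equal $-2\boldsymbol{\omega}\times\mathbf{v}''_{rot} - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov})$. The Python sketch below is an illustrative check under those assumptions; the function names and initial conditions are hypothetical, and derivatives are estimated by finite differences.

```python
import math


def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotating_components(vec, w, t):
    """Components of a fixed-frame vector along axes rotating about z at rate w."""
    c, s = math.cos(w * t), math.sin(w * t)
    return (c * vec[0] + s * vec[1], -s * vec[0] + c * vec[1], vec[2])


def position_rot(t, r0, v0, w):
    """Rotating-frame components of a force-free particle r_fix = r0 + v0 t."""
    r_fix = tuple(x + v * t for x, v in zip(r0, v0))
    return rotating_components(r_fix, w, t)


def check_free_particle(r0, v0, w, t, h=1e-4):
    """Return the measured and predicted accelerations in the rotating frame."""
    prev, here, nxt = (position_rot(s, r0, v0, w) for s in (t - h, t, t + h))
    v_rot = tuple((n - p) / (2.0 * h) for n, p in zip(nxt, prev))
    a_rot = tuple((n - 2.0 * c + p) / h ** 2
                  for n, c, p in zip(nxt, here, prev))
    omega = (0.0, 0.0, w)
    coriolis = tuple(-2.0 * c for c in cross(omega, v_rot))
    centrifugal = tuple(-c for c in cross(omega, cross(omega, here)))
    predicted = tuple(a + b for a, b in zip(coriolis, centrifugal))
    return a_rot, predicted
```

The two returned tuples agree to within the finite-difference error, showing that the curved path seen by the rotating observer is fully accounted for by the Coriolis and centrifugal terms even though no physical force acts.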


10.6  Lagrangian  mechanics  in  a non-inertial  frame 

The  above  derivation  of  the  equations  of  motion  in  the  rotating  frame  is  based  on  Newtonian  mechanics. 
Lagrangian  mechanics  provides  another  derivation  of  these  equations  of  motion  for  a rotating  frame  of 
reference  by  exploiting  the  fact  that  the  Lagrangian  is  a scalar  which  is  frame  independent,  that  is,  it  is 
invariant  to  rotation  of  the  frame  of  reference. 

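The Lagrangian in any frame is given by

$$L = \frac{1}{2}m\mathbf{v}\cdot\mathbf{v} - U(r) \qquad (10.29)$$

The scalar product $\mathbf{v}\cdot\mathbf{v}$ is the same in any rotated frame and can be evaluated in terms of the rotating frame variables using the same decomposition of the translational plus rotational motion as used previously and given in equation 10.21.

Equation (10.21) decomposes the velocity in the fixed inertial frame $\mathbf{v}_{fix}$ into three vector terms: the translational velocity $\mathbf{V}_{fix}$ of the translating frame, the velocity in the rotating-translating frame $\mathbf{v}''_{rot}$, and the rotational velocity $(\boldsymbol{\omega}\times\mathbf{r}'_{mov})$. Using equations 10.29 and 10.21, plus appendix equation B.21 for the triple products, gives that the Lagrangian evaluated using $\mathbf{v}_{fix}\cdot\mathbf{v}_{fix}$ equals

$$L = \frac{1}{2}m\left[\mathbf{V}_{fix}\cdot\mathbf{V}_{fix} + \mathbf{v}''_{rot}\cdot\mathbf{v}''_{rot} + 2\mathbf{V}_{fix}\cdot\mathbf{v}''_{rot} + 2\mathbf{V}_{fix}\cdot(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + 2\mathbf{v}''_{rot}\cdot(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + (\boldsymbol{\omega}\times\mathbf{r}'_{mov})^2\right] - U(r) \qquad (10.30)$$

This can be used to derive the canonical momentum in the rotating frame

$$\mathbf{p}''_{rot} = \frac{\partial L}{\partial\mathbf{v}''_{rot}} = m\left[\mathbf{V}_{fix} + \mathbf{v}''_{rot} + \boldsymbol{\omega}\times\mathbf{r}'_{mov}\right] \qquad (10.31)$$

The Lagrange equations can be used to derive the equations of motion in terms of the variables evaluated in the rotating reference frame. The required Lagrange derivatives are

$$\frac{d}{dt}\frac{\partial L}{\partial\mathbf{v}''_{rot}} = m\left[\mathbf{A}_{fix} + \mathbf{a}''_{rot} + (\boldsymbol{\omega}\times\mathbf{v}''_{rot}) + (\dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov})\right]_{rot} \qquad (10.32)$$

and

$$\frac{\partial L}{\partial\mathbf{r}''_{rot}} = -m\left[(\boldsymbol{\omega}\times\mathbf{V}_{fix}) + (\boldsymbol{\omega}\times\mathbf{v}''_{rot}) + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov})\right] - \nabla U \qquad (10.33)$$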

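where the scalar triple product, equation B.21, has been used. Thus the Lagrange equations give for the rotating frame basis that

$$m\mathbf{a}''_{rot} = -\nabla U - m\left[\mathbf{A}_{fix} + (\boldsymbol{\omega}\times\mathbf{V}_{fix}) + 2(\boldsymbol{\omega}\times\mathbf{v}''_{rot}) + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + (\dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov})\right]_{rot} \qquad (10.34)$$

The external force is identified as $\mathbf{F}_{fix} = -\nabla U$. Equation 10.16 can be used to transform between the fixed and the rotating bases.

$$\mathbf{A}_{fix} = \left[\mathbf{A}_{fix} + (\boldsymbol{\omega}\times\mathbf{V}_{fix})\right]_{rot} \qquad (10.35)$$

This leads to an effective force in the non-inertial translating plus rotating frame that corresponds to an effective Newtonian force of

$$\mathbf{F}^{eff}_{rot} = m\mathbf{a}''_{rot} = \mathbf{F} - m\left[\mathbf{A}_{fix} + 2\boldsymbol{\omega}\times\mathbf{v}''_{rot} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + (\dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov})\right] \qquad (10.36)$$

where $\mathbf{A}_{fix}$ is expressed in the fixed frame. The derivation of equation 10.36 using Lagrangian mechanics confirms the identical formula 10.28 derived using Newtonian mechanics.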

The four correction terms for the non-inertial frame basis correspond to the following effective forces.

Translational acceleration: $-m\mathbf{A}_{fix}$ is the usual inertial force experienced in a linearly accelerating frame of reference, where $\mathbf{A}_{fix}$ is with respect to the fixed frame.

Coriolis force: $\mathbf{F}_{cor} = -2m\boldsymbol{\omega} \times \mathbf{v}''_{rot}$. This is a new type of inertial force that is present only when a particle is moving in the rotating frame. This force is proportional to the velocity in the rotating frame and is independent of the position in the rotating frame.

Centrifugal force: $\mathbf{F}_{cf} = -m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}'_{mov})$. This is due to the centripetal acceleration of the particle owing to the rotation of the moving axis about the axis of rotation.

Transverse (azimuthal) force: $-m\dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov}$. This is a straightforward term due to acceleration of the particle due to the angular acceleration of the rotating axes.

The  above  inertial  forces  are  correction  terms  arising  from  trying  to  extend  Newton’s  laws  of  motion  to 
a non-inertial  frame  involving  both  translation  and  rotation.  These  correction  forces  are  often  referred  to  as 
“fictitious”  forces.  However,  these  non-inertial  forces  are  very  real  when  located  in  the  non-inertial  frame. 
Since  the  centrifugal  and  Coriolis  terms  are  unusual  they  are  discussed  below. 
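
As a minimal sketch of how the four correction terms of equation 10.36 can be evaluated for given vectors, the following Python fragment computes each term separately. The function names, argument names, and the numerical values are illustrative assumptions and are not taken from the text.

```python
# Sketch: evaluate the four inertial-force terms of equation 10.36.
# All names and numbers here are illustrative assumptions.

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(s, a):
    """Multiply a 3-vector by a scalar."""
    return tuple(s * x for x in a)

def inertial_forces(m, A_fix, omega, omega_dot, r, v):
    """Return the four correction terms of eq. 10.36 as 3-vectors.

    m         : mass
    A_fix     : translational acceleration of the moving origin
    omega     : angular velocity of the rotating frame
    omega_dot : angular acceleration of the rotating frame
    r, v      : position r'_mov and velocity v''_rot of the particle
    """
    return {
        "translational": scale(-m, A_fix),
        "coriolis": scale(-2.0 * m, cross(omega, v)),
        "centrifugal": scale(-m, cross(omega, cross(omega, r))),
        "azimuthal": scale(-m, cross(omega_dot, r)),
    }

# A 1 kg particle at r = (2, 0, 0) m moving with v = (0, 1, 0) m/s in a frame
# that rotates at a constant 3 rad/s about z with a fixed origin.
forces = inertial_forces(1.0, (0.0, 0.0, 0.0), (0.0, 0.0, 3.0),
                         (0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 1.0, 0.0))
for name, vec in forces.items():
    print(name, vec)
```

For this choice the centrifugal term points radially outwards along $+x$ and the Coriolis term points to the right of the velocity, as expected for anticlockwise rotation.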

10.7  Centrifugal  force 

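The centrifugal force was defined as

$$\mathbf{F}_{cf} = -m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}'_{mov}) \tag{10.37}$$

Note that

$$\boldsymbol{\omega} \cdot \mathbf{F}_{cf} = 0 \tag{10.38}$$

therefore the centrifugal force is perpendicular to the axis of rotation.

Using the vector identity, equation B.24, allows the centrifugal force to be written as

$$\mathbf{F}_{cf} = -m \left[ (\boldsymbol{\omega} \cdot \mathbf{r}'_{mov}) \boldsymbol{\omega} - \omega^2 \mathbf{r}'_{mov} \right] \tag{10.39}$$

For the case where the radius $\mathbf{r}'$ is perpendicular to $\boldsymbol{\omega}$, then $\boldsymbol{\omega} \cdot \mathbf{r}' = 0$ and thus for this case

$$\mathbf{F}_{cf} = m\omega^2 \mathbf{r}'_{mov} \tag{10.40}$$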

The centrifugal force is experienced when a car is driven rapidly around a bend. The passenger experiences an apparent centrifugal (center fleeing) force that thrusts them to the outside of the bend relative to the inside of the turning car. In reality, relative to the fixed inertial frame, i.e. the road, the friction between the car tires and the road is changing the direction of the car towards the inside of the bend and the car seat is causing the centripetal (center seeking) acceleration of the passenger. A bucket of water attached to a rope can be swung around in a vertical plane without spilling any water if the centrifugal force exceeds the gravitational force at the top of the trajectory.

Figure 10.3: Centrifugal force.
10.8  Coriolis  force

The Coriolis force was defined to be

$$\mathbf{F}_{cor} = -2m\boldsymbol{\omega} \times \mathbf{v}''_{rot} \tag{10.41}$$

where $\mathbf{v}''$ is the velocity measured in the rotating (double-primed) frame. The Coriolis force is an interesting force; it is perpendicular to both the axis of rotation and the velocity vector in the rotating frame, that is, it is analogous to the $q\mathbf{v} \times \mathbf{B}$ Lorentz magnetic force.

The understanding of the Coriolis effect is facilitated by considering the physics of a hockey puck sliding on a rotating frictionless table. Assume that the table rotates with constant angular frequency $\boldsymbol{\omega} = \omega\hat{\mathbf{k}}$ about the $z$ axis. For this system the origin of the rotating system is fixed, and the angular frequency is constant, thus $\mathbf{A}$ and $\dot{\boldsymbol{\omega}} \times \mathbf{r}'$ are zero. Also it is assumed that there are no external forces acting on the hockey puck, thus the net acceleration of the puck sliding on the table, as seen in the rotating frame, simplifies to

$$\mathbf{a}''_{rot} = -2\boldsymbol{\omega} \times \mathbf{v}''_{rot} - \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}'_{mov}) = -2\omega\hat{\mathbf{k}} \times \mathbf{v}''_{rot} + \omega^2 \mathbf{r}'_{mov} \tag{10.42}$$

The centrifugal acceleration $+\omega^2 \mathbf{r}'_{mov}$ is radially outwards while the Coriolis acceleration $-2\omega\hat{\mathbf{k}} \times \mathbf{v}''_{rot}$ is to the right of the direction of motion. Integration of the equations of motion can be used to calculate the trajectories in the rotating frame of reference.

Figure 10.4 illustrates trajectories of the hockey puck in the rotating reference frame when no external forces are acting, that is, in the inertial frame the puck moves in a straight line with constant velocity $\mathbf{v}_0$. In the rotating reference frame the Coriolis force accelerates the puck to the right, leading to trajectories that exhibit spiral motion. The apparently complicated trajectories are a result of the observer being in the rotating frame, in which the straight inertial-frame trajectories of the moving puck appear as spiralling trajectories.
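
The statement that the spiral trajectories are just straight lines viewed from the table can be checked numerically. The sketch below, under assumed parameter values, integrates equation 10.42 with a hand-written fourth-order Runge-Kutta step and compares the result with the straight inertial-frame path transformed into the rotating frame. The function names, step size, and initial conditions are illustrative assumptions.

```python
import math

def rhs(state, omega):
    """Time derivative of (x, y, vx, vy) from eq. 10.42, omega along +z."""
    x, y, vx, vy = state
    ax = 2.0 * omega * vy + omega**2 * x    # Coriolis + centrifugal, x part
    ay = -2.0 * omega * vx + omega**2 * y   # Coriolis + centrifugal, y part
    return (vx, vy, ax, ay)

def rk4_step(state, omega, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = rhs(state, omega)
    k2 = rhs(shifted(k1, dt / 2), omega)
    k3 = rhs(shifted(k2, dt / 2), omega)
    k4 = rhs(shifted(k3, dt), omega)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def to_rotating(x, y, omega, t):
    """Coordinates of an inertial-frame point as seen from the rotating table."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return (c * x + s * y, -s * x + c * y)

def max_mismatch(omega=1.0, r0=(-0.5, 0.1), v0=(1.0, 0.2), dt=1e-3, steps=2000):
    """Largest distance between the integrated and the transformed trajectory."""
    # Initial velocity in the rotating frame: v'' = v - omega x r
    state = (r0[0], r0[1], v0[0] + omega * r0[1], v0[1] - omega * r0[0])
    worst = 0.0
    for n in range(1, steps + 1):
        state = rk4_step(state, omega, dt)
        t = n * dt
        # Force-free straight-line motion in the inertial frame
        xr, yr = to_rotating(r0[0] + v0[0] * t, r0[1] + v0[1] * t, omega, t)
        worst = max(worst, math.hypot(state[0] - xr, state[1] - yr))
    return worst

print(max_mismatch())
```

The two descriptions agree to within the integration error, which illustrates that the Coriolis and centrifugal terms together account for the curvature seen in the rotating frame.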

The  Coriolis  force  is  the  reason  that  winds  circulate  in  an  anticlockwise  direction  about  low-pressure 
regions  in  the  Earth’s  northern  hemisphere.  It  also  has  important  consequences  in  many  activities  on  earth 
such  as  ballet  dancing,  ice  skating,  acrobatics,  nuclear  and  molecular  rotation,  and  the  motion  of  missiles. 


Figure 10.4: Force-free motion of a hockey puck sliding on a rotating frictionless table of radius $R$ that is rotating with constant angular frequency $\boldsymbol{\omega}$ out of the page.


10.1  Example:  Accelerating  spring  plane  pendulum 


Comparison of the relative merits of using a non-inertial frame versus an inertial frame is given by a spring pendulum attached to an accelerating fulcrum. As shown in the figure, the spring pendulum comprises a mass $m$ attached to a massless spring that has a rest length $r_0$ and spring constant $k$. The system is in a vertical gravitational field $g$ and the fulcrum of the pendulum is accelerating vertically upwards with a constant acceleration $a$. Assume that the spring pendulum oscillates only in the vertical $\theta$ plane.

Inertial frame:

This problem can be solved in the fixed inertial coordinate system with coordinates $(x, y)$. These coordinates, and their time derivatives, are given in terms of $r$ and $\theta$ by

$$x = r\sin\theta \qquad y = -r\cos\theta + \frac{1}{2}at^2$$

$$\dot{x} = \dot{r}\sin\theta + r\dot{\theta}\cos\theta \qquad \dot{y} = r\dot{\theta}\sin\theta - \dot{r}\cos\theta + at$$

Thus

$$L = \frac{1}{2}m\left(\dot{x}^2 + \dot{y}^2\right) - mgy - \frac{1}{2}k(r - r_0)^2$$

$$= \frac{1}{2}m\left[\dot{r}^2 + r^2\dot{\theta}^2 + a^2t^2 + 2at\left(r\dot{\theta}\sin\theta - \dot{r}\cos\theta\right)\right] + mg\left(r\cos\theta - \frac{1}{2}at^2\right) - \frac{1}{2}k(r - r_0)^2$$

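The Lagrange equations of motion are given by

$\Lambda_r L = 0$:

$$\ddot{r} - r\dot{\theta}^2 - (a + g)\cos\theta + \frac{k}{m}(r - r_0) = 0$$

$\Lambda_\theta L = 0$:

$$\ddot{\theta} + \frac{2}{r}\dot{r}\dot{\theta} + \frac{(a + g)}{r}\sin\theta = 0$$

The generalized momenta are

$$p_r = \frac{\partial L}{\partial \dot{r}} = m\dot{r} - mat\cos\theta$$

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta} + matr\sin\theta$$

These lead to the corresponding velocities of

$$\dot{r} = \frac{p_r}{m} + at\cos\theta$$

$$\dot{\theta} = \frac{p_\theta}{mr^2} - \frac{at}{r}\sin\theta$$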

and thus the Hamiltonian is given by

$$H = p_r\dot{r} + p_\theta\dot{\theta} - L$$

$$= \frac{p_r^2}{2m} + \frac{p_\theta^2}{2mr^2} - \frac{at}{r}p_\theta\sin\theta + atp_r\cos\theta + \frac{1}{2}k(r - r_0)^2 + \frac{1}{2}mgat^2 - mgr\cos\theta$$

The Hamilton equations of motion give that

$$\dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{m} + at\cos\theta$$

$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^2} - \frac{at}{r}\sin\theta$$

These radial and angular velocities are the same as obtained using Lagrangian mechanics.

The Hamilton equations for $\dot{p}_r$ and $\dot{p}_\theta$ are given by

$$\dot{p}_r = -\frac{\partial H}{\partial r} = -\frac{at}{r^2}p_\theta\sin\theta - k(r - r_0) + mg\cos\theta + \frac{p_\theta^2}{mr^3}$$

Similarly

$$\dot{p}_\theta = -\frac{\partial H}{\partial \theta} = \frac{at}{r}p_\theta\cos\theta + atp_r\sin\theta - mgr\sin\theta$$

The transformation equations relating the generalized coordinates $r, \theta$ are time dependent so the Hamiltonian $H$ does not equal the total energy $E$. In addition neither the Lagrangian nor the Hamiltonian are conserved since they both are time dependent. The fact that the Hamiltonian is not conserved is obvious since the whole system is accelerating upwards leading to increasing kinetic and potential energies. Moreover, the time derivative of the angular momentum $\dot{p}_\theta$ is non-zero so the angular momentum $p_\theta$ is not conserved.

Non-inertial  fulcrum  frame: 

This system also can be addressed in the accelerating non-inertial fulcrum frame of reference which is fixed to the fulcrum of the spring of the pendulum. In this non-inertial frame of reference, the acceleration of the frame can be taken into account using an effective acceleration $a$ which is added to the gravitational acceleration; that is, $g$ is replaced by an effective gravitational acceleration $(g + a)$. Then the Lagrangian in the fulcrum frame simplifies to

$$L_{fulcrum} = \frac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) + m(g + a)(r\cos\theta) - \frac{1}{2}k(r - r_0)^2$$

The Lagrange equations of motion in the fulcrum frame are given by

$\Lambda_r L_{fulcrum} = 0$:

$$\ddot{r} - r\dot{\theta}^2 - (a + g)\cos\theta + \frac{k}{m}(r - r_0) = 0$$

$\Lambda_\theta L_{fulcrum} = 0$:

$$\ddot{\theta} + \frac{2}{r}\dot{r}\dot{\theta} + \frac{(a + g)}{r}\sin\theta = 0$$

These are identical to the Lagrange equations of motion derived in the inertial frame.
The $L_{fulcrum}$ can be used to derive the momenta in the non-inertial fulcrum frame

$$p_r = \frac{\partial L_{fulcrum}}{\partial \dot{r}} = m\dot{r}$$

$$p_\theta = \frac{\partial L_{fulcrum}}{\partial \dot{\theta}} = mr^2\dot{\theta}$$

which comprise only a part of the momenta derived in the inertial frame. These partial fulcrum momenta lead to a fulcrum-frame Hamiltonian

$$H_{fulcrum} = p_r\dot{r} + p_\theta\dot{\theta} - L_{fulcrum} = \frac{p_r^2}{2m} + \frac{p_\theta^2}{2mr^2} + \frac{1}{2}k(r - r_0)^2 - m(g + a)r\cos\theta$$

Both $L_{fulcrum}$ and $H_{fulcrum}$ are time independent and thus the fulcrum Hamiltonian $H_{fulcrum}$ is a constant of motion in the fulcrum frame. However, $H_{fulcrum}$ does not equal the total energy which is increasing with time due to the acceleration of the fulcrum frame relative to the inertial frame. This example illustrates that use of non-inertial frames can simplify solution of accelerating systems.
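
The conservation of the fulcrum-frame Hamiltonian can be illustrated numerically. The following sketch integrates the two Lagrange equations of motion given above with a fourth-order Runge-Kutta step and compares $H_{fulcrum}$ at the start and end of the run. The parameter values, initial conditions, and function names are assumptions chosen only for illustration.

```python
import math

def derivs(state, m, k, r0, g, a):
    """Right-hand sides of the fulcrum-frame Lagrange equations of motion."""
    r, th, rdot, thdot = state
    rddot = r * thdot**2 + (a + g) * math.cos(th) - (k / m) * (r - r0)
    thddot = -2.0 * rdot * thdot / r - (a + g) * math.sin(th) / r
    return (rdot, thdot, rddot, thddot)

def h_fulcrum(state, m, k, r0, g, a):
    """Fulcrum-frame Hamiltonian, using p_r = m*rdot and p_theta = m*r^2*thdot."""
    r, th, rdot, thdot = state
    kinetic = 0.5 * m * (rdot**2 + (r * thdot)**2)
    return kinetic + 0.5 * k * (r - r0)**2 - m * (g + a) * r * math.cos(th)

def rk4_step(state, dt, *params):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(kvec, h):
        return tuple(s + h * ki for s, ki in zip(state, kvec))
    k1 = derivs(state, *params)
    k2 = derivs(shifted(k1, dt / 2), *params)
    k3 = derivs(shifted(k2, dt / 2), *params)
    k4 = derivs(shifted(k3, dt), *params)
    return tuple(s + dt / 6 * (p + 2 * q + 2 * u + w)
                 for s, p, q, u, w in zip(state, k1, k2, k3, k4))

def hamiltonian_drift(steps=2000, dt=1e-3):
    """Return (H at t=0, H at the end) for an assumed set of parameters."""
    params = (1.0, 50.0, 1.0, 9.8, 2.0)   # m, k, r0, g, a (illustrative values)
    state = (1.2, 0.3, 0.0, 0.0)          # r, theta, rdot, thetadot
    h_start = h_fulcrum(state, *params)
    for _ in range(steps):
        state = rk4_step(state, dt, *params)
    return h_start, h_fulcrum(state, *params)

h_start, h_end = hamiltonian_drift()
print(h_start, h_end)
```

The two printed values agree to within the integration error, consistent with $H_{fulcrum}$ being a constant of motion even though it is not the total energy.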


10.2  Example:  Surface  of  rotating  liquid 


Find the shape of the surface of liquid in a bucket that rotates with angular speed $\omega$ as shown in the adjacent figure. Assume that the liquid is at rest in the frame of the bucket. Therefore, in the coordinate system rotating with the bucket of liquid, the centrifugal force is important whereas the Coriolis, translational, and transverse forces are zero. The external force is

$$\mathbf{F} = \mathbf{F}' + m\mathbf{g}$$

where $\mathbf{F}'$ is the pressure force, which is perpendicular to the surface. At equilibrium the acceleration of the surface is zero, that is

$$m\mathbf{a}'' = 0 = \mathbf{F}' + m\left(\mathbf{g} - \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}')\right)$$

The effective gravitational acceleration is

$$\mathbf{g}_{eff} = \mathbf{g} - \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}')$$

which must be perpendicular to the surface of the liquid since $\mathbf{F}'$ is perpendicular to the surface of a fluid, and the net force is zero. In cylindrical coordinates this can be written as

$$\mathbf{g}_{eff} = -g\hat{\mathbf{z}} + \rho\omega^2\hat{\boldsymbol{\rho}}$$

From the figure it can be deduced that

$$\tan\theta = \frac{dz}{d\rho} = \frac{\rho\omega^2}{g}$$

By integration

$$z = \frac{\omega^2}{2g}\rho^2 + \text{constant}$$

This is the equation of a paraboloid and corresponds to a parabolic gravitational equipotential energy surface.
Astrophysicists  build  large  parabolic  mirrors  for  telescopes  by  continuously  spinning  a large  vat  of  glass  while 
it  solidifies.  This  is  much  easier  than  grinding  a large  cylindrical  block  of  glass  into  a parabolic  shape. 
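
As a small numerical illustration of the paraboloid result $z = \omega^2\rho^2/2g$, the sketch below evaluates the surface height and slope for an assumed bucket, and the focal length of the resulting paraboloid. The focal-length relation $f = g/2\omega^2$ is not stated in the text; it follows from comparing with the standard form $z = \rho^2/4f$ and is included here as an assumption-labelled extra. The value of $g$, the radius, and the spin rate are illustrative.

```python
def surface_height(rho, omega, g=9.81):
    """Height of the liquid surface above its lowest point, z = omega^2 rho^2 / (2 g)."""
    return omega**2 * rho**2 / (2.0 * g)

def surface_slope(rho, omega, g=9.81):
    """Slope dz/drho = rho omega^2 / g, i.e. tan(theta) in the example."""
    return rho * omega**2 / g

def focal_length(omega, g=9.81):
    """Focal length of the paraboloid, from z = rho^2 / (4 f) (added relation)."""
    return g / (2.0 * omega**2)

omega = 2.0    # rad/s, assumed spin rate
radius = 0.5   # m, assumed bucket radius
print("rim height above centre:", surface_height(radius, omega))
print("slope at the rim:", surface_slope(radius, omega))
print("focal length:", focal_length(omega))
```

Spinning faster steepens the surface and shortens the focal length, which is why the rotation rate sets the optical properties of a spun mirror.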


10.3  Example:  The  pirouette 


An interesting application of the Coriolis force is the problem of a spinning ice skater or ballet dancer. Her angular frequency increases when she draws in her arms. The conventional explanation is that angular momentum is conserved in the absence of any external torques, which is correct. Thus since her moment of inertia decreases when she retracts her arms, her angular velocity must increase to maintain a constant angular momentum $L = I\omega$. But this explanation does not address the question as to what are the forces that cause the angular frequency to increase. The real radial forces the skater feels when she retracts her arms cannot directly lead to angular acceleration since radial forces are perpendicular to the rotation. The following derivation shows that the Coriolis force $-2m\boldsymbol{\omega} \times \mathbf{v}''_{rot}$ acts tangentially to the radial retraction velocity of her arms leading to the angular acceleration required to maintain constant angular momentum.

Consider that a mass $m$ is moving radially at a velocity $\dot{\mathbf{r}}''_{rot}$, then the Coriolis force in the rotating frame is

$$\mathbf{F}_{cor} = -2m\boldsymbol{\omega} \times \dot{\mathbf{r}}''_{rot}$$

This Coriolis force leads to an angular acceleration of the mass of

$$\dot{\boldsymbol{\omega}} = -\frac{2\boldsymbol{\omega} \times \dot{\mathbf{r}}''_{rot}}{r''} \tag{a}$$

that is, in magnitude $\dot{\omega} = -2\omega\dot{r}''/r''$, so the rotational frequency decreases if the radius is increased. Note that, as shown in equation 10.17, $\dot{\boldsymbol{\omega}} = \dot{\boldsymbol{\omega}}''$. This nonzero value of $\dot{\omega}$ obviously leads to an azimuthal force in addition to the Coriolis force. Consider the rate of change of angular momentum for the rotating mass $m$ assuming that the angular momentum comes purely from the rotation $\omega$. Then in the rotating frame

$$\dot{p}_{\theta''} = \frac{d}{dt}\left(mr''^2\omega\right) = 2mr''\dot{r}''\omega + mr''^2\dot{\omega}$$

Substituting equation a for $\dot{\omega}$ in the second term gives

$$\dot{p}_{\theta''} = 2mr''\dot{r}''\omega - 2mr''\dot{r}''\omega = 0$$

That is, the two terms cancel. Thus the angular momentum is conserved for this case where the velocity is radial. Note that, since $p_{\theta''}$ is assumed to be collinear with $\boldsymbol{\omega}$, then it is the same in both the stationary and rotating frames of reference and thus angular momentum is conserved in both frames. In addition, in the fixed frame, the angular momentum is conserved if no external torques are acting as assumed above.

Note that the rotational energy is

$$E_{rot} = \frac{1}{2}I\omega^2$$

Also the angular momentum is conserved, that is

$$p_\theta = l = I\omega$$

Substituting $\omega = \frac{l}{I}$ in the rotational energy gives

$$E_{rot} = \frac{p_\theta^2}{2I} = \frac{l^2}{2I}$$

Therefore the rotational energy actually increases as the moment of inertia decreases when the ice skater pulls her arms close to her body. This increase in rotational energy is provided by the work done as the dancer pulls her arms inward against the centrifugal force.
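
The energy bookkeeping in this example can be checked with a simple numerical sketch. It assumes a single point mass $m$ at radius $r$ (so $I = mr^2$), a conserved angular momentum $l$, and a slow (quasi-static) retraction so that radial kinetic energy is neglected; these simplifications, the function names, and the numbers are assumptions made for illustration. The work done against the centrifugal force $m\omega^2 r = l^2/mr^3$ is integrated by the midpoint rule and compared with the change in $E_{rot} = l^2/2I$.

```python
def rotational_energy(l, m, r):
    """E_rot = l^2 / (2 I) for a point mass with I = m r^2."""
    return l**2 / (2.0 * m * r**2)

def work_against_centrifugal(l, m, r_start, r_end, steps=20000):
    """Work done by the inward pull that balances the centrifugal force
    l^2 / (m r^3) while the radius changes from r_start to r_end."""
    dr = (r_end - r_start) / steps          # negative when pulling inward
    work = 0.0
    for i in range(steps):
        r = r_start + (i + 0.5) * dr        # midpoint of the sub-interval
        centrifugal = l**2 / (m * r**3)     # outward magnitude at fixed l
        work += -centrifugal * dr           # inward force times displacement
    return work

m, l = 2.0, 3.0              # kg and kg m^2/s, assumed values
r_out, r_in = 0.9, 0.3       # m, arms extended and retracted
gain = rotational_energy(l, m, r_in) - rotational_energy(l, m, r_out)
work = work_against_centrifugal(l, m, r_out, r_in)
print(gain, work)
```

The two numbers agree, which is the quantitative content of the statement that the increase in rotational energy is supplied by the work done pulling the arms inward.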
10.9  Routhian  reduction  for  rotating  systems

The Routhian reduction technique, which was introduced in chapter 8.6, is a hybrid variational approach. It was devised by Routh to handle the cyclic and non-cyclic variables separately in order to simultaneously exploit the differing advantages of the Hamiltonian and Lagrangian formulations. The Routhian reduction technique is a powerful method for handling rotating systems ranging from galaxies to molecules, or deformed nuclei, as well as rotating machinery in engineering. A valuable feature of the Hamiltonian formulation is that it allows elimination of cyclic variables which reduces the number of degrees of freedom to be handled. As a consequence, cyclic variables are called ignorable variables in Hamiltonian mechanics. The Lagrangian, the Hamiltonian and the Routhian all are scalars under rotation and thus are invariant to rotation of the frame of reference. Note that often there are only two cyclic variables for a rotating system, that is, $\dot{\phi} = \omega$ and the corresponding canonical total angular momentum $p_\phi = J$.

As mentioned in chapter 8.6, there are two possible Routhians that are useful for handling rotating frames of reference. For rotating systems the cyclic Routhian $R_{cyclic}$ simplifies to

$$R_{cyclic}(q_1, \ldots, q_n; \dot{q}_1, \ldots, \dot{q}_s; p_{s+1}, \ldots, p_n; t) = H_{cyclic} - L_{noncyclic} = \boldsymbol{\omega} \cdot \mathbf{J} - L \tag{10.43}$$

This Routhian behaves like a Hamiltonian for the ignorable cyclic coordinates $\boldsymbol{\omega}, \mathbf{J}$. Simultaneously it behaves like a negative Lagrangian $L_{noncyclic}$ for all the other coordinates.

The non-cyclic Routhian $R_{noncyclic}$ complements $R_{cyclic}$ in that it is defined as

$$R_{noncyclic}(q_1, \ldots, q_n; p_1, \ldots, p_s; \dot{q}_{s+1}, \ldots, \dot{q}_n; t) = H_{noncyclic} - L_{cyclic} = H - \boldsymbol{\omega} \cdot \mathbf{J} \tag{10.44}$$

This non-cyclic Routhian behaves like a Hamiltonian for all the non-cyclic variables and behaves like a negative Lagrangian for the two cyclic variables $\omega$ and $J$. Since the cyclic variables are constants of motion, then $R_{noncyclic}$ is a constant of motion that equals the energy in the rotating frame if $H$ is a constant of motion. However, $R_{noncyclic}$ does not equal the total energy since the coordinate transformation is time dependent, that is, the Routhian $R_{noncyclic}$ corresponds to the energy of the non-cyclic parts of the motion.

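For example, the Routhian $R_{noncyclic}$ for a system that is being cranked about the $\phi$ axis at some fixed angular frequency $\dot{\phi} = \omega$, with corresponding total angular momentum $p_\phi = J$, can be written as$^{1}$

$$R_{noncyclic} = H - \boldsymbol{\omega} \cdot \mathbf{J} \tag{10.45}$$

$$= \frac{1}{2}m\left[\mathbf{V} \cdot \mathbf{V} + \mathbf{v}'' \cdot \mathbf{v}'' + 2\mathbf{V} \cdot \mathbf{v}'' + 2\mathbf{V} \cdot (\boldsymbol{\omega} \times \mathbf{r}') + 2\mathbf{v}'' \cdot (\boldsymbol{\omega} \times \mathbf{r}') + (\boldsymbol{\omega} \times \mathbf{r}')^2\right] - \boldsymbol{\omega} \cdot \mathbf{J} + U(r)$$

Note that $R_{noncyclic}$ is a constant of motion if $\frac{\partial L}{\partial t} = 0$, which is the case when the system is being cranked at a constant angular frequency. However the Hamiltonian in the rotating frame $H_{rot} = H - \boldsymbol{\omega} \cdot \mathbf{J}$ is given by $R_{noncyclic} = H_{rot} \neq E$ since the coordinate transformation is time dependent. The canonical Hamilton equations for the fourth and fifth terms in the bracket can be identified with the Coriolis force $-2m\boldsymbol{\omega} \times \mathbf{v}''$, while the last term in the bracket is identified with the centrifugal force. That is, define

$$U_{cf} \equiv -\frac{1}{2}m(\boldsymbol{\omega} \times \mathbf{r}')^2 \tag{10.46}$$

where the gradient of $U_{cf}$ gives the usual centrifugal force.

$$\mathbf{F}_{cf} = -\nabla U_{cf} = \frac{m}{2}\nabla\left[\omega^2 r'^2 - (\boldsymbol{\omega} \cdot \mathbf{r}')^2\right] = m\left[\omega^2\mathbf{r}' - (\boldsymbol{\omega} \cdot \mathbf{r}')\boldsymbol{\omega}\right] = -m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}') \tag{10.47}$$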
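
Equation 10.47 can be spot-checked numerically. The sketch below compares a central-difference estimate of $-\nabla U_{cf}$ with the closed form $-m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}')$ for one arbitrary choice of vectors. The helper names, the step size, and the test vectors are illustrative assumptions.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def u_cf(r, omega, m):
    """Centrifugal potential energy of eq. 10.46: -(1/2) m |omega x r|^2."""
    w = cross(omega, r)
    return -0.5 * m * sum(c * c for c in w)

def numeric_force(r, omega, m, h=1e-5):
    """Central-difference estimate of -grad U_cf at position r."""
    force = []
    for i in range(3):
        plus, minus = list(r), list(r)
        plus[i] += h
        minus[i] -= h
        force.append(-(u_cf(plus, omega, m) - u_cf(minus, omega, m)) / (2 * h))
    return tuple(force)

def analytic_force(r, omega, m):
    """Closed form of eq. 10.47: -m omega x (omega x r)."""
    return tuple(-m * c for c in cross(omega, cross(omega, r)))

m = 1.5
omega = (0.3, -0.2, 1.1)
r = (0.7, 0.4, -0.5)
print(numeric_force(r, omega, m))
print(analytic_force(r, omega, m))
```

Because $U_{cf}$ is quadratic in $\mathbf{r}'$, the central difference reproduces the analytic gradient up to rounding error.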

The Routhian reduction method is used extensively in science and engineering to describe rotational motion of rigid bodies, molecules, deformed nuclei, and astrophysical objects. The cyclic variables describe the rotation of the frame and thus the Routhian $R_{noncyclic} = H_{rot}$ corresponds to the Hamiltonian for the non-cyclic variables in the rotating frame.

1 For  clarity  sections  10.1  to  10.8  of  this  chapter  adopted  a naming  convention  that  uses  unprimed  coordinates  with  the 

subscript  fix  for  the  inertial  frame  of  reference,  primed  coordinates  with  the  subscript  mov  for  the  translating  coordinates,  and 
double-primed  coordinates  with  the  subscript  rot  for  the  translating  plus  rotating  frame.  For  brevity  the  subsequent  discussion 
omits  the  redundant  subscripts  fix,  mov,  rot  since  the  single  and  double  prime  superscripts  completely  define  the  moving  and 
rotating  frames  of  reference. 
10.4  Example:  Cranked  plane  pendulum


The cranked plane pendulum, which is also called the rotating plane pendulum, comprises a plane pendulum that is cranked around a vertical axis at a constant angular velocity $\dot{\phi} = \omega$ as determined by some external drive mechanism. The parameters are illustrated in the adjacent figure. The cranked pendulum nicely illustrates the advantages of working in a non-inertial rotating frame for a driven rotating system. Although the cranked plane pendulum looks similar to the spherical pendulum, there is one very important difference; for the spherical pendulum $p_\phi = ml^2\sin^2\theta\,\dot{\phi}$ is a constant of motion and thus the angular velocity varies with $\theta$, i.e. $\dot{\phi} = \frac{p_\phi}{ml^2\sin^2\theta}$, whereas for the cranked plane pendulum, the constant of motion is $\dot{\phi} = \omega$ and thus the angular momentum varies with $\theta$, i.e. $p_\phi = ml^2\sin^2\theta\,\omega$. For the cranked plane pendulum, the energy must flow into and out of the cranking drive system that is providing the constraint force to satisfy the equation of constraint

$$g_\phi = \dot{\phi} - \omega = 0$$

Cranked plane pendulum that is cranked around the vertical axis with angular velocity $\dot{\phi} = \omega$.


The easiest way to solve the equations of motion for the cranked plane pendulum is to use generalized coordinates to absorb the equation of constraint and applied constraint torque. This is done by incorporating the $\dot{\phi} = \omega$ constraint explicitly in the Lagrangian or Hamiltonian and solving for just $\theta$ in the rotating frame.

Assuming that $\dot{\phi} = \omega$, and using generalized coordinates to absorb the cranking constraint forces, then the Lagrangian for the cranked pendulum can be written as

$$L = \frac{1}{2}ml^2\left(\dot{\theta}^2 + \sin^2\theta\,\omega^2\right) + mgl\cos\theta$$

The momentum conjugate to $\theta$ is

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = ml^2\dot{\theta}$$

Consider the Routhian $R_{noncyclic} = p_\theta\dot{\theta} - L = H - p_\phi\dot{\phi}$ which acts as a Hamiltonian $H_{rot}$ in the rotating frame

$$R_{noncyclic} = p_\theta\dot{\theta} - L = H - p_\phi\dot{\phi} = \frac{p_\theta^2}{2ml^2} - \frac{1}{2}ml^2\omega^2\sin^2\theta - mgl\cos\theta$$

Note that if $\dot{\phi} = \omega$ is constant, then $R_{noncyclic}$ is a constant of motion for rotation about the $\phi$ axis since it is independent of $\phi$. Also $\frac{dR_{noncyclic}}{dt} = -\frac{\partial L}{\partial t} = 0$, thus the energy in the rotating non-inertial frame of the pendulum $R_{noncyclic} = H_{rot} = H - p_\phi\dot{\phi}$ is a constant of motion, but it does not equal the total energy since the rotating coordinate transformation is time dependent. The driver that cranks the system at a constant $\omega$ provides or absorbs the energy $dW = dE = \omega\,dp_\phi$ as $\theta$ changes in order to maintain a constant $\omega$.

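The Routhian $R_{noncyclic}$ can be used to derive the equations of motion using Hamiltonian mechanics.

$$\dot{\theta} = \frac{\partial R_{noncyclic}}{\partial p_\theta} = \frac{p_\theta}{ml^2}$$

$$\dot{p}_\theta = -\frac{\partial R_{noncyclic}}{\partial \theta} = -mgl\sin\theta\left[1 - \frac{l}{g}\cos\theta\,\omega^2\right]$$

Since $p_\theta = ml^2\dot{\theta}$, then the equation of motion is

$$\ddot{\theta} + \frac{g}{l}\sin\theta\left[1 - \frac{l}{g}\cos\theta\,\omega^2\right] = 0 \tag{a}$$

Assuming that $\sin\theta \approx \theta$, then equation a leads to linear harmonic oscillator solutions about a minimum at $\theta = 0$ if the bracket is positive. That is, when the bracket $\left[1 - \frac{l}{g}\cos\theta\,\omega^2\right] > 0$ then equation a corresponds to a harmonic oscillator with angular frequency $\Omega$ given by

$$\Omega^2 = \frac{g}{l}\left[1 - \frac{l}{g}\cos\theta\,\omega^2\right]$$
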
The adjacent figure shows the phase-space diagrams for a plane pendulum rotating about a vertical axis at angular velocity $\omega$ for (a) $\omega < \sqrt{g/l}$ and (b) $\omega > \sqrt{g/l}$. The upper phase plot shows small $\omega$ when the square bracket of equation a is positive and the phase space trajectories are ellipses around the stable equilibrium point $(0, 0)$. As $\omega$ increases the bracket becomes smaller and changes sign when $\omega^2\cos\theta = \frac{g}{l}$. For larger $\omega$ the bracket is negative leading to hyperbolic phase space trajectories around the $(\theta, p_\theta) = (0, 0)$ equilibrium point, that is, an unstable equilibrium point. However, new stable equilibrium points now occur at angles $(\theta, p_\theta) = (\pm\theta_0, 0)$ where $\cos\theta_0 = \frac{g}{l\omega^2}$. That is, the equilibrium point $(0, 0)$ undergoes bifurcation as illustrated in the lower figure. These new equilibrium points are stable as illustrated by the elliptical trajectories around these points. It is interesting that these new equilibrium points $\pm\theta_0$ move to larger angles given by $\cos\theta_0 = \frac{g}{l\omega^2}$ beyond the bifurcation point at $\frac{g}{l\omega^2} = 1$. For low energy the mass oscillates about the minimum at $\theta = \theta_0$ whereas the motion becomes more complicated for higher energy. The bifurcation corresponds to symmetry breaking since, under spatial reflection, the equilibrium point is unchanged at low rotational frequencies but it transforms from $+\theta_0$ to $-\theta_0$ once the solution bifurcates, that is, the symmetry is broken. Also chaos can occur at the separatrix that separates the bifurcation. Note that either the Lagrange multiplier approach, or the generalized force approach, can be used to determine the applied torque required to ensure a constant $\omega$ for the cranked pendulum.

Phase-space diagrams for the plane pendulum cranked at angular velocity $\omega$ about a vertical axis. Figure (a) is for $\omega < \sqrt{g/l}$ while (b) is for $\omega > \sqrt{g/l}$.
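
The bifurcation described above can be explored with a short numerical sketch. It locates the equilibrium angles of equation a and tests their stability from the sign of the second derivative of the effective potential $U_{eff}(\theta) = -\frac{1}{2}ml^2\omega^2\sin^2\theta - mgl\cos\theta$, which is the $p_\theta = 0$ part of $R_{noncyclic}$. The second-derivative expression is obtained here by direct differentiation and is not quoted from the text; the function names and the values of $g$ and $l$ are assumptions.

```python
import math

def equilibrium_angles(omega, g=9.81, l=1.0):
    """Equilibrium angles of equation a with |theta| < pi/2.

    Below the bifurcation only theta = 0 exists; above it the pair
    +/- theta_0 with cos(theta_0) = g / (l omega^2) appears as well.
    """
    if omega**2 <= g / l:
        return [0.0]
    theta0 = math.acos(g / (l * omega**2))
    return [-theta0, 0.0, theta0]

def is_stable(theta, omega, g=9.81, l=1.0):
    """Stability from the sign of d^2 U_eff / d theta^2 (per unit m l^2)."""
    curvature = -omega**2 * math.cos(2.0 * theta) + (g / l) * math.cos(theta)
    return curvature > 0.0

for omega in (2.0, 5.0):   # one value below and one above sqrt(g/l) ~ 3.13 rad/s
    for theta in equilibrium_angles(omega):
        print(omega, round(theta, 4), is_stable(theta, omega))
```

Below the critical angular velocity only $\theta = 0$ is found and it is stable; above it $\theta = 0$ becomes unstable and the two new equilibria at $\pm\theta_0$ are stable, matching the phase-space description.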


10.5  Example:  Nucleon  orbits  in  deformed  nuclei 


Consider the rotation of an axially-symmetric, prolate-deformed nucleus. Many nuclei have a prolate spheroidal shape (the shape of a rugby ball) and they rotate perpendicular to the symmetry axis. In the non-inertial body-fixed frame, pairs of nucleons, each with angular momentum $j$, are bound in orbits with the projection of the angular momentum along the symmetry axis being conserved with value $\Omega = K$, which is a cyclic variable. Since the nucleus is of dimensions $10^{-14}$ m, quantization is important and the quantized binding energies of the individual nucleons are separated by spacings $< 500$ keV.

The Lagrangian and Hamiltonian are scalars and can be evaluated in any coordinate frame of reference. It is most useful to calculate the Hamiltonian for a deformed body in the non-inertial rotating body-fixed frame of reference. The body-fixed Hamiltonian corresponds to the Routhian $R_{noncyclic}$

$$R_{noncyclic} = H - \boldsymbol{\omega} \cdot \mathbf{J}$$

where it is assumed that the deformed nucleus has the symmetry axis along the $z$ direction and rotates about the $x$ axis. Since the Routhian is for a non-inertial rotating frame of reference it does not include the total energy but, if the shape is constant in time, then $R_{noncyclic}$ and the corresponding body-fixed Hamiltonian are conserved and the energy levels for the nucleons bound in the spheroidal potential well can be calculated using a conventional quantum mechanical model.

Schematic diagram for the strong coupling of a nucleon to the deformation axis. The projection of $I$ on the symmetry axis is $K$, and the projection of $j$ is $\Omega$. For axial symmetry Noether's theorem gives that the projection of the angular momentum $K$ on the symmetry axis is a conserved quantity.

For a prolate spheroidal deformed potential well, the nucleon orbits that have the angular momentum nearly aligned to the symmetry axis correspond to nucleon trajectories that are restricted to the narrowest part of the spheroid, whereas trajectories with the angular momentum vector close to perpendicular to the symmetry axis have trajectories that probe the largest radii of the spheroid. The Heisenberg Uncertainty Principle, mentioned in chapter 3.12, describes how orbits restricted to the smallest dimension will have the highest linear momentum, and corresponding kinetic energy, and vice versa for the larger sized orbits. Thus the binding energy of different nucleon trajectories in the spheroidal potential well depends on the angle between the angular momentum vector and the symmetry axis of the spheroid as well as the deformation of the spheroid. A quantal nuclear model Hamiltonian is solved for assumed spheroidal-shaped potential wells. The corresponding orbits each have angular momenta $j_i$ for which the projection of the angular momentum along the symmetry axis $\Omega_i$ is conserved, but the projection of $j_i$ in the laboratory frame $j_z$ is not conserved since the potential well is not spherically symmetric. However, the total Hamiltonian is spherically symmetric in the laboratory frame, which is satisfied by allowing the deformed spheroidal potential well to rotate freely in the laboratory frame, and then $j_i^2$, $j_{i,z}$, and $\Omega_i$ all are conserved quantities. The attractive residual nucleon-nucleon pairing interaction results in pairs of nucleons being bound in time-reversed orbits $(j \times j)^0$, that is, with resultant total spin zero, in this spheroidal nuclear potential. Excitation of an even-even nucleus can break one pair and then the total projection of the angular momentum along the symmetry axis is $K = |\Omega_1 \pm \Omega_2|$, depending on whether the projections are parallel or antiparallel. More excitation energy can break several pairs and the projections continue to be additive. The binding energies calculated in the spheroidal potential well must be added to the rotational energy $E_{rot}$, which depends on the moment of inertia $\mathcal{J}$, to get the total energy. Nuclear structure measurements are in good agreement with the predictions of nuclear structure calculations that employ the Routhian approach.


10.10  Effective  gravitational  force  near  the  surface  of  the  Earth 


Consider  that  the  translational  acceleration  of  the  center  of 
the  Earth  can  be  neglected,  and  thus  a set  of  non-rotating 
axes  through  the  center  of  the  Earth  can  be  assumed  to  be 
approximately  an  inertial  frame.  The  effects  of  the  motion  of 
the  Earth  around  the  Sun,  or  the  motion  of  the  Solar  system 
in  our  Galaxy,  are  small  compared  with  the  effects  due  to  the 
rotation  of  the  Earth. 

Consider  a rotating  frame  attached  to  the  surface  of  the 
earth  as  shown  in  figure  10.5.  The  vector  with  respect  to  the 
center  of  the  Earth  r can  be  decomposed  into  a vector  to  the 
origin  of  the  reference  frame  fixed  to  the  surface  of  the  Earth 
R,  plus  the  vector  with  respect  to  this  surface  reference  frame 


r = R + r'  (10.48) 

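If the external force is separated into the gravitational term $m\mathbf{g}$, plus some other physical force $\mathbf{F}$, then the acceleration in the non-inertial surface frame of reference is

$$\mathbf{a}' = \frac{\mathbf{F}}{m} + \mathbf{g} - \left(\mathbf{A} + 2\boldsymbol{\omega} \times \mathbf{v}' + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}') + \dot{\boldsymbol{\omega}} \times \mathbf{r}'\right) \tag{10.49}$$
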

Figure  10.5:  Rotating  frame  at  the  surface  of 
the  Earth. 


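But

$$\mathbf{V} = \left(\frac{d\mathbf{R}}{dt}\right)_{fixed} = \left(\frac{d\mathbf{R}}{dt}\right)_{rotating} + \boldsymbol{\omega} \times \mathbf{R} = \boldsymbol{\omega} \times \mathbf{R} \tag{10.50}$$

since in the rotating frame $\left(\frac{d\mathbf{R}}{dt}\right)_{rotating} = 0$. Also the acceleration

$$\mathbf{A} = \left(\frac{d\mathbf{V}}{dt}\right)_{fixed} = \left(\frac{d\mathbf{V}}{dt}\right)_{rotating} + \boldsymbol{\omega} \times \mathbf{V} = \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{R}) \tag{10.51}$$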



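since $\left(\frac{d\mathbf{V}}{dt}\right)_{rotating} = 0$. Substituting this into the above equation gives

$$\begin{aligned}
\mathbf{a}' &= \frac{\mathbf{F}}{m} + \mathbf{g} - \left(2\boldsymbol{\omega}\times\mathbf{v}' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times[\mathbf{r}' + \mathbf{R}]) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'\right) \\
&= \frac{\mathbf{F}}{m} + \mathbf{g} - \left(2\boldsymbol{\omega}\times\mathbf{v}' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'\right)
\end{aligned}$$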

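where $\mathbf{r}$ is with respect to the center of the Earth. This is as expected directly from equation 10.36. Since the angular frequency of the earth is constant, $\dot{\boldsymbol{\omega}}\times\mathbf{r}' = 0$. Thus the acceleration can be written as

$$\mathbf{a}' = \frac{\mathbf{F}}{m} + \left[\mathbf{g} - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r})\right] - 2\boldsymbol{\omega}\times\mathbf{v}' \tag{10.52}$$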


The  term  in  the  square  brackets  combines  the  gravitational  acceleration  plus  the  centrifugal  acceleration. 


A measurement of the Earth's gravitational acceleration actually measures the term in the square brackets in equation 10.52, that is, an effective gravitational acceleration

$$\mathbf{g}_{eff} = \mathbf{g} - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}) \tag{10.53}$$

where, near the surface of the earth, $r \approx R$. The effective gravitational force does not point towards the center of the Earth, as shown in figure 10.6. A plumb line points, or an object falls, in the direction of $\mathbf{g}_{eff}$. The shape of the earth is such that the Earth's surface is perpendicular to $\mathbf{g}_{eff}$. This is the reason why the earth is distorted into an oblate ellipsoid, that is, it is flattened at the poles.

The angle $\alpha$ between $\mathbf{g}_{eff}$ and the line pointing to the center of the earth depends on the latitude $\lambda = \frac{\pi}{2} - \theta$. Note that the colatitude $\theta$ is taken to be zero at the North pole whereas the latitude $\lambda$ is taken to be zero at the equator. The angle $\alpha$ can be estimated by assuming that $r' \ll R$; the centrifugal term can then be approximated by

$$|\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r})| \approx \omega^2 R\sin\theta = \omega^2 R\cos\lambda \tag{10.54}$$


Figure  10.6:  Effective  gravitational  acceleration. 


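This is quite small for the Earth since $\omega = 0.73\times 10^{-4}$ rad/s and $R = 6371$ km, leading to a correction term $\omega^2 R\cos\lambda = 0.03\cos\lambda$ m/s$^2$. Since

$$g_{eff}^{horizontal} = \omega^2 R\cos\lambda\sin\lambda \tag{10.55}$$

and

$$g_{eff}^{vertical} = g - \omega^2 R\cos^2\lambda \tag{10.56}$$

then the angle $\alpha$ between $\mathbf{g}_{eff}$ and $\mathbf{g}$ is given by

$$\alpha \approx \tan\alpha = \frac{g_{eff}^{horizontal}}{g_{eff}^{vertical}} = \frac{\omega^2 R\cos\lambda\sin\lambda}{g - \omega^2 R\cos^2\lambda} \tag{10.57}$$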


This has a maximum value near $\lambda = 45°$, where $\alpha \approx 0.0017$ rad $\approx 0.1°$.
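The size of this deflection can be checked numerically. The following is a minimal sketch that evaluates equations 10.54 to 10.57 over a range of latitudes; the constants `OMEGA`, `R_EARTH` and `G`, and the function name `plumb_line_deflection`, are illustrative assumptions rather than values or names taken from the text.

```python
import math

OMEGA = 7.292e-5   # Earth's rotation rate in rad/s (assumed value)
R_EARTH = 6.371e6  # mean radius of the Earth in m
G = 9.81           # gravitational acceleration in m/s^2 (assumed value)


def plumb_line_deflection(lat_deg):
    """Angle in radians between g_eff and g at latitude lat_deg (equation 10.57)."""
    lam = math.radians(lat_deg)
    centrifugal = OMEGA**2 * R_EARTH * math.cos(lam)  # equation 10.54
    horizontal = centrifugal * math.sin(lam)          # equation 10.55
    vertical = G - centrifugal * math.cos(lam)        # equation 10.56
    return math.atan2(horizontal, vertical)


for lat in (0, 15, 30, 45, 60, 75, 90):
    alpha = plumb_line_deflection(lat)
    print(f"latitude {lat:2d} deg: alpha = {math.degrees(alpha):.4f} deg")
```

With these assumed constants the deflection vanishes at the equator and at the poles and is of order a tenth of a degree at mid latitudes.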

10.11 Free motion on the earth


The calculation of trajectories for objects as they move near the surface of the earth is required in many applications. Such calculations require inclusion of the non-inertial Coriolis force. In the frame of reference fixed to the earth's surface, assuming that air resistance and other forces can be neglected, the acceleration equals

$$\mathbf{a}' = \mathbf{g}_{eff} - 2\boldsymbol{\omega}\times\mathbf{v}' \tag{10.58}$$


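Neglect the centrifugal correction term since it is very small, that is, let $\mathbf{g}_{eff} = \mathbf{g}$. Using the coordinate axes shown in figure 10.7, with $\mathbf{i}'$ pointing east, $\mathbf{j}'$ north and $\mathbf{k}'$ vertically upward, the surface-frame vectors have components

$$\boldsymbol{\omega} = 0\,\mathbf{i}' + \omega\cos\lambda\,\mathbf{j}' + \omega\sin\lambda\,\mathbf{k}' \tag{10.59}$$

and

$$\mathbf{g}_{eff} = -g\,\mathbf{k}' \tag{10.60}$$

Figure 10.7: Rotating frame fixed on the surface of the Earth.

Thus the Coriolis term is

$$2\boldsymbol{\omega}\times\mathbf{v}' = 2\begin{vmatrix} \mathbf{i}' & \mathbf{j}' & \mathbf{k}' \\ 0 & \omega\cos\lambda & \omega\sin\lambda \\ \dot{x}' & \dot{y}' & \dot{z}' \end{vmatrix} = 2\left[\left(\omega\dot{z}'\cos\lambda - \omega\dot{y}'\sin\lambda\right)\mathbf{i}' + \left(\omega\dot{x}'\sin\lambda\right)\mathbf{j}' - \left(\omega\dot{x}'\cos\lambda\right)\mathbf{k}'\right]$$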


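Therefore the equations of motion are

$$m\ddot{\mathbf{r}}' = -mg\,\mathbf{k}' - 2m\left[\mathbf{i}'\left(\dot{z}'\omega\cos\lambda - \dot{y}'\omega\sin\lambda\right) + \mathbf{j}'\,\dot{x}'\omega\sin\lambda - \mathbf{k}'\,\dot{x}'\omega\cos\lambda\right] \tag{10.61}$$

That is, the components of this equation of motion are

$$\begin{aligned}
\ddot{x}' &= -2\omega\left(\dot{z}'\cos\lambda - \dot{y}'\sin\lambda\right) \\
\ddot{y}' &= -2\omega\dot{x}'\sin\lambda \\
\ddot{z}' &= -g + 2\omega\dot{x}'\cos\lambda
\end{aligned} \tag{10.62}$$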

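Integrating these differential equations, with the origin taken at the initial position, gives

$$\begin{aligned}
\dot{x}' &= -2\omega\left(z'\cos\lambda - y'\sin\lambda\right) + \dot{x}'_0 \\
\dot{y}' &= -2\omega x'\sin\lambda + \dot{y}'_0 \\
\dot{z}' &= -gt + 2\omega x'\cos\lambda + \dot{z}'_0
\end{aligned} \tag{10.63}$$

where $\dot{x}'_0, \dot{y}'_0, \dot{z}'_0$ are the initial velocities. Substituting the above velocity relations into the equation of motion for $x'$ gives

$$\ddot{x}' = 2\omega gt\cos\lambda - 2\omega\left(\dot{z}'_0\cos\lambda - \dot{y}'_0\sin\lambda\right) - 4\omega^2 x' \tag{10.64}$$

The last term $4\omega^2 x'$ is small and can be neglected, leading to a simple uncoupled second-order differential equation in $x'$. Integrating this twice, assuming that the initial position is $x'_0 = y'_0 = z'_0 = 0$ and using the fact that $2\omega g\cos\lambda$ and $2\omega\left(\dot{z}'_0\cos\lambda - \dot{y}'_0\sin\lambda\right)$ are constant, gives

$$x' = \frac{1}{3}\omega gt^3\cos\lambda - \omega t^2\left(\dot{z}'_0\cos\lambda - \dot{y}'_0\sin\lambda\right) + \dot{x}'_0 t \tag{10.65}$$

Similarly,

$$y' = \dot{y}'_0 t - \omega\dot{x}'_0 t^2\sin\lambda \tag{10.66}$$

$$z' = -\frac{1}{2}gt^2 + \dot{z}'_0 t + \omega\dot{x}'_0 t^2\cos\lambda \tag{10.67}$$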
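The first-order displacements in equations 10.65 to 10.67 can be compared against a direct numerical integration of the coupled equations 10.62. The sketch below uses a hand-written fourth-order Runge-Kutta step; the launch velocity, latitude, step size and the constants `OMEGA` and `G` are illustrative assumptions, and the function names are hypothetical.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate in rad/s (assumed value)
G = 9.81          # gravitational acceleration in m/s^2 (assumed value)


def derivatives(state, lam):
    """Right-hand side of equations 10.62 for state = (x, y, z, vx, vy, vz)."""
    x, y, z, vx, vy, vz = state
    ax = -2.0 * OMEGA * (vz * math.cos(lam) - vy * math.sin(lam))
    ay = -2.0 * OMEGA * vx * math.sin(lam)
    az = -G + 2.0 * OMEGA * vx * math.cos(lam)
    return (vx, vy, vz, ax, ay, az)


def rk4_step(state, dt, lam):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = derivatives(state, lam)
    k2 = derivatives(shifted(k1, dt / 2), lam)
    k3 = derivatives(shifted(k2, dt / 2), lam)
    k4 = derivatives(shifted(k3, dt), lam)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def closed_form(t, v0, lam):
    """First-order displacements of equations 10.65 to 10.67."""
    vx0, vy0, vz0 = v0
    c, s = math.cos(lam), math.sin(lam)
    x = OMEGA * G * t**3 * c / 3 - OMEGA * t**2 * (vz0 * c - vy0 * s) + vx0 * t
    y = vy0 * t - OMEGA * vx0 * t**2 * s
    z = -0.5 * G * t**2 + vz0 * t + OMEGA * vx0 * t**2 * c
    return (x, y, z)


lam = math.radians(45.0)
v0 = (100.0, 50.0, 200.0)   # east, north, up launch velocity in m/s (illustrative)
state = (0.0, 0.0, 0.0) + v0
dt, steps = 0.01, 2000      # integrate for 20 s of flight
for _ in range(steps):
    state = rk4_step(state, dt, lam)

numeric = state[:3]
analytic = closed_form(dt * steps, v0, lam)
print("numeric :", numeric)
print("analytic:", analytic)
```

The two results should differ only by the neglected terms of order $\omega^2$, which for a flight of this duration are far smaller than the first-order Coriolis displacements.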




Consider the following special cases:

10.6  Example:  Free  fall  from  rest 

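Assume that an object falls a height $h$ starting from rest at $t = 0$, $x' = 0$, $y' = 0$, $z' = h$. Then

$$\begin{aligned}
x' &= \frac{1}{3}\omega gt^3\cos\lambda \\
y' &= 0 \\
z' &= h - \frac{1}{2}gt^2
\end{aligned}$$

Substituting for $t$ at the time the object reaches $z' = 0$, that is $t = \sqrt{2h/g}$, gives

$$x' = \frac{1}{3}\omega\cos\lambda\sqrt{\frac{8h^3}{g}}$$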


Thus  the  object  drifts  eastward  as  a consequence  of  the  earth’s  rotation.  Note  that  relative  to  the  fixed  frame 
it  is  obvious  that  the  angular  velocity  of  the  body  must  increase  as  it  falls  to  compensate  for  the  reduced 
distance  from  the  axis  of  rotation  in  order  to  ensure  that  the  angular  momentum  is  conserved. 
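To get a feeling for the size of the eastward drift, the result of this example can be evaluated for a few drop heights. This is a minimal sketch; the constants `OMEGA` and `G`, the chosen latitude, and the function name `eastward_drift` are assumptions made for illustration.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate in rad/s (assumed value)
G = 9.81          # gravitational acceleration in m/s^2 (assumed value)


def eastward_drift(h, lat_deg):
    """Eastward deflection in m of an object dropped from rest at height h in m."""
    lam = math.radians(lat_deg)
    return OMEGA * math.cos(lam) * math.sqrt(8.0 * h**3 / G) / 3.0


for h in (10.0, 100.0, 1000.0):
    print(f"h = {h:6.0f} m: drift = {100.0 * eastward_drift(h, 45.0):.2f} cm")
```

For a 100 m drop at latitude 45° the drift is roughly 1.5 cm, and it grows as $h^{3/2}$.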

10.7  Example:  Projectile  fired  vertically  upwards 

An upward fired projectile with initial velocities $\dot{x}'_0 = \dot{y}'_0 = 0$ and $\dot{z}'_0 = v_0$ leads to the relations

$$\begin{aligned}
x' &= \frac{1}{3}\omega gt^3\cos\lambda - \omega t^2 v_0\cos\lambda \\
y' &= 0 \\
z' &= -\frac{1}{2}gt^2 + v_0 t
\end{aligned}$$

Solving for $t$ when $z' = 0$ gives $t = 0$ and $t = \frac{2v_0}{g}$. Also, since the maximum height $h$ that the projectile reaches is related to $v_0$ by

$$v_0 = \sqrt{2gh}$$

the final deflection is

$$x' = -\frac{4}{3}\omega\cos\lambda\sqrt{\frac{8h^3}{g}}$$

Thus the body drifts westwards.
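The algebra leading to the final deflection can be checked by evaluating the $x'$ relation at the landing time $t = 2v_0/g$ and comparing it with the closed form in terms of $h$. The sketch below assumes illustrative values for `OMEGA`, `G`, the latitude and the peak height; the function names are hypothetical.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate in rad/s (assumed value)
G = 9.81          # gravitational acceleration in m/s^2 (assumed value)


def drift_upward_shot(v0, lat_deg):
    """x' at landing for a vertical launch at speed v0, from the x' relation."""
    lam = math.radians(lat_deg)
    t = 2.0 * v0 / G  # time of flight, from z' = 0
    return (OMEGA * G * t**3 / 3.0 - OMEGA * t**2 * v0) * math.cos(lam)


def drift_from_height(h, lat_deg):
    """Closed form -(4/3) omega cos(lambda) sqrt(8 h^3 / g) for peak height h."""
    lam = math.radians(lat_deg)
    return -4.0 / 3.0 * OMEGA * math.cos(lam) * math.sqrt(8.0 * h**3 / G)


h = 100.0                      # peak height in m (illustrative)
v0 = math.sqrt(2.0 * G * h)    # launch speed that reaches height h
print(drift_upward_shot(v0, 45.0), drift_from_height(h, 45.0))
```

The negative sign corresponds to a westward drift, and its magnitude is four times the eastward drift of an object dropped from rest at the same height $h$.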


10.8  Example:  Motion  parallel  to  Earth’s  surface 

For motion in the horizontal $x'$-$y'$ plane the deflection is always to the right in the northern hemisphere of the Earth, since the vertical component of $\boldsymbol{\omega}$ is upwards and thus $-2\boldsymbol{\omega}\times\mathbf{v}'$ points to the right. In the southern hemisphere the vertical component of $\boldsymbol{\omega}$ is downward and thus $-2\boldsymbol{\omega}\times\mathbf{v}'$ points to the left. This is also shown using the above relations for the case of a projectile fired upwards in an easterly direction with initial velocity components $\dot{x}'_0, 0, \dot{z}'_0$. The resultant displacements are

$$x' = \frac{1}{3}\omega gt^3\cos\lambda - \omega t^2\dot{z}'_0\cos\lambda + \dot{x}'_0 t$$

Similarly,

$$\begin{aligned}
y' &= -\omega\dot{x}'_0 t^2\sin\lambda \\
z' &= -\frac{1}{2}gt^2 + \dot{z}'_0 t + \omega\dot{x}'_0 t^2\cos\lambda
\end{aligned}$$

The trajectory is non-planar and, in the northern hemisphere, the projectile drifts to the right, that is, southerly.

In the battle of the River Plate, during World War 2, the gunners on the British cruisers Exeter, Ajax and Achilles found that their accurately aimed salvos against the German pocket battleship Graf Spee were falling 100 yards to the left. The designers of the gun sighting mechanisms had corrected for the Coriolis effect assuming the ships would fight at latitudes near 50° north, not 50° south.

10.12 Weather systems


Weather  systems  are  a classic  example  of  motion  in  a rotating  coordinate  system.  In  the  northern  hemisphere, 
air  flowing  into  a low-pressure  region  is  deflected  to  the  right  causing  counterclockwise  circulation,  whereas 
air  flowing  out  of  a high-pressure  region  is  deflected  to  the  right  causing  a clockwise  circulation.  Trade  winds 
on  the  Earth  result  from  air  rising  or  sinking  due  to  thermal  activity  combined  with  the  Coriolis  effect. 
Similar  behavior  is  observed  on  other  planets  such  as  the  Red  Spot  on  Jupiter. 

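For a fluid or gas, equation (10.36) can be written in terms of the fluid density $\rho$ in the form

$$\rho\,\mathbf{a}'' = -\nabla P - \rho\left[2\boldsymbol{\omega}\times\mathbf{v}'' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}')\right] \tag{10.68}$$

where the translational acceleration $\mathbf{A}$, the gravitational force, and the azimuthal acceleration $(\dot{\boldsymbol{\omega}}\times\mathbf{r}')$ terms are ignored. The external force per unit volume equals the pressure gradient $-\nabla P$ while $\boldsymbol{\omega}$ is the rotation vector of the earth.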

In fluid flow, the Rossby number $Ro$ is defined to be

$$Ro = \frac{\text{inertial force}}{\text{Coriolis force}} \approx \frac{|\mathbf{a}''|}{|2\boldsymbol{\omega}\times\mathbf{v}''|} \tag{10.69}$$


For large dimensional pressure systems in the atmosphere, e.g. $L \sim 1000$ km, the Rossby number is $Ro \sim 0.1$ and thus the Coriolis force dominates and the radial acceleration can be neglected. This leads to a flow velocity $v \sim 10$ m/s which is perpendicular to the pressure gradient $\nabla P$, that is, the air flows horizontally parallel to the isobars of constant pressure, which is called geostrophic flow. For much smaller dimension systems, such as at the wall of a hurricane, $L \sim 50$ km and $v \sim 50$ m/s, the Rossby number is $Ro \sim 10$ and the Coriolis effect plays a much less significant role compared to the balance between the radial centrifugal forces and the pressure gradient. The same situation of the Coriolis forces being insignificant occurs for most small-scale vortices such as tornadoes, typical thermal vortices in the atmosphere, and for water draining a bath tub.
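The Rossby numbers quoted above can be reproduced with the usual scaling estimate, in which the inertial acceleration is taken to be of order $v^2/L$ and the Coriolis acceleration of order $2\omega v\sin\lambda$, so that $Ro \approx v/(2\omega L\sin\lambda)$. The sketch below is an illustration under that assumption; the latitudes and the bathtub values are illustrative choices that do not come from the text, and the function name is hypothetical.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate in rad/s (assumed value)


def rossby_number(speed, length, lat_deg):
    """Scaling estimate Ro = v / (2 * omega * sin(latitude) * L)."""
    coriolis_parameter = 2.0 * OMEGA * math.sin(math.radians(lat_deg))
    return speed / (coriolis_parameter * length)


# (speed in m/s, length scale in m, latitude in degrees); latitudes are assumed
cases = {
    "large pressure system": (10.0, 1.0e6, 45.0),
    "hurricane eye wall": (50.0, 5.0e4, 25.0),
    "bathtub drain": (0.1, 0.1, 45.0),
}

for name, (v, L, lat) in cases.items():
    print(f"{name:22s} Ro ~ {rossby_number(v, L, lat):.3g}")
```

A Rossby number well below one indicates that the Coriolis term dominates, while a value well above one indicates that it is a minor correction.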


10.12.1  Low-pressure  systems: 

It is interesting to analyze the motion of air circulating around a low pressure region at large radii where the motion is tangential. As shown in figure 10.8, a parcel of air circulating anticlockwise around the low with velocity $v$ involves a pressure difference $\Delta P$ acting on the surface area $S$, plus the centrifugal and Coriolis forces. Assuming that these forces are balanced such that $\mathbf{a}'' \approx 0$, then equation 10.68 simplifies to

$$\frac{v^2}{r} = \frac{1}{\rho}\nabla P - 2v\omega\sin\lambda \tag{10.70}$$

where the latitude $\lambda = \frac{\pi}{2} - \theta$. Thus the force equation can be written

$$\frac{1}{\rho}\frac{dP}{dr} = \frac{v^2}{r} + 2v\omega\sin\lambda \tag{10.71}$$


It is apparent that the combined outward Coriolis force plus outward centrifugal force, acting on the circulating air, can support a large pressure gradient.

Figure 10.8: Air flow and pressures around a low-pressure region.

The tangential velocity $v$ can be obtained by solving this equation to give

$$v = \sqrt{\left(r\omega\sin\lambda\right)^2 + \frac{r}{\rho}\frac{dP}{dr}} - r\omega\sin\lambda \tag{10.72}$$
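Equation 10.72 is easy to evaluate for representative numbers. In the sketch below the air density, the radius, the latitude and the pressure gradient (about 111 mb spread over roughly 100 km) are illustrative assumptions chosen only to give a hurricane-like order of magnitude; the function name `cyclonic_speed` is hypothetical.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate in rad/s (assumed value)
RHO = 1.2         # air density in kg/m^3 (assumed value)


def cyclonic_speed(r, dP_dr, lat_deg, rho=RHO):
    """Tangential speed around a low from equation 10.72 (SI units)."""
    a = r * OMEGA * math.sin(math.radians(lat_deg))
    return math.sqrt(a * a + r * dP_dr / rho) - a


r = 5.0e4         # radius in m (illustrative)
gradient = 0.111  # pressure gradient in Pa/m (illustrative)
lat = 26.0        # latitude in degrees (illustrative)

v = cyclonic_speed(r, gradient, lat)
print(f"v = {v:.1f} m/s = {3.6 * v:.0f} km/hr")

# Consistency check against the force balance of equation 10.71
lhs = v**2 / r + 2.0 * v * OMEGA * math.sin(math.radians(lat))
print(f"v^2/r + 2 v omega sin(lat) = {lhs:.5f}, (1/rho) dP/dr = {gradient / RHO:.5f}")
```

For these assumed values the Coriolis contribution is small compared with the centrifugal term, which is the high Rossby number regime described earlier for the wall of a hurricane.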


Note that the velocity equals zero when $r = 0$ assuming that $\frac{dP}{dr}$ is finite. That is, the velocity reaches a maximum at a radius


_ln  1 dP  ^ 
rpeakvei~  + puj sin X dr  > 


(10.73)

Figure 10.9: Hurricane Katrina over the Gulf of Mexico on 28 August 2005. [Published by the NOAA]


which  occurs  at  the  wall  of  the  eye  of  the  circulating  low-pressure  system. 

Low pressure regions are produced by heating of air causing it to rise and resulting in an inflow of air to replace the rising air. Hurricanes form over warm water when the temperature exceeds 26°C and the moisture levels are above average. They are created at latitudes between 10° and 15° where the sea is warmest, but not closer to the equator where the Coriolis force drops to zero. About 90% of the heating of the air comes from the latent heat of vaporization due to the rising warm moist air condensing into water droplets in the cloud, similar to what occurs in thunderstorms. For hurricanes in the northern hemisphere, the air circulates anticlockwise inwards. Near the wall of the eye of the hurricane, the air rises rapidly to high altitudes at which it then flows clockwise and outwards and subsequently back down in the outer reaches of the hurricane. Both the wind velocity and pressure are low inside the eye, which can be cloud free. The strongest winds are in the vortex surrounding the eye of the hurricane, while weak winds exist in the counter-rotating vortex of sinking air that occurs far outside the hurricane.

Figure 10.9 shows the satellite picture of hurricane Katrina, recorded on 28 August 2005. The eye of the hurricane is readily apparent in this picture. The central pressure was 90200 N/m$^2$ (902 mb) compared with the standard atmospheric pressure of 101300 N/m$^2$ (1013 mb). This 111 mb pressure difference produced steady winds in Katrina of 280 km/hr (175 mph) with gusts up to 344 km/hr, and the hurricane resulted in 1833 fatalities.

Tornadoes are another example of a vortex low-pressure system; they are the opposite extreme in both size and duration compared with a hurricane. Tornadoes may last only about 10 minutes and be quite small in radius. Pressure drops of up to 100 mb have been recorded, but since tornadoes may be only a few hundred meters in diameter, the pressure gradient can be much higher than for hurricanes, leading to localized winds thought to approach 500 km/hr. Unfortunately, the instrumentation and buildings hit by a tornado often are destroyed, making study difficult. Note that the pressure gradient in small-diameter rope tornadoes is much larger than in the larger 1/4 mile diameter tornadoes, resulting in much higher and more destructive winds.

10.12.2 High-pressure systems:

High-pressure systems differ from low-pressure systems in that the Coriolis force points inward, opposing the outward pressure gradient and centrifugal force. That is,

$$\frac{v^2}{r} = 2v\omega\sin\lambda - \frac{1}{\rho}\frac{dP}{dr} \tag{10.74}$$


which gives that

$$v = r\omega\sin\lambda - \sqrt{\left(r\omega\sin\lambda\right)^2 - \frac{r}{\rho}\frac{dP}{dr}} \tag{10.75}$$


This implies that the maximum pressure gradient plus centrifugal force supported by the Coriolis force is

$$\frac{r}{\rho}\frac{dP}{dr} \leq \left(r\omega\sin\lambda\right)^2 \tag{10.76}$$


As a consequence, high pressure regions tend to have weak pressure gradients and light winds, in contrast to the large pressure gradients plus concomitant damaging winds possible for low pressure systems such as hurricanes or tornadoes.
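The limit in equation 10.76 can be made concrete with a short calculation. The sketch below evaluates the largest pressure gradient that a high-pressure system can support at a given radius and the wind speed from equation 10.75; the air density, radius and latitude are illustrative assumptions and the function names are hypothetical.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate in rad/s (assumed value)
RHO = 1.2         # air density in kg/m^3 (assumed value)


def max_pressure_gradient(r, lat_deg, rho=RHO):
    """Largest dP/dr in Pa/m allowed by equation 10.76 at radius r in m."""
    s = OMEGA * math.sin(math.radians(lat_deg))
    return rho * r * s * s


def anticyclonic_speed(r, dP_dr, lat_deg, rho=RHO):
    """Wind speed from equation 10.75, or None if no balanced flow exists."""
    a = r * OMEGA * math.sin(math.radians(lat_deg))
    discriminant = a * a - r * dP_dr / rho
    if discriminant < 0.0:
        return None  # gradient exceeds the limit of equation 10.76
    return a - math.sqrt(discriminant)


r, lat = 5.0e5, 45.0  # radius in m and latitude in degrees (illustrative)
limit = max_pressure_gradient(r, lat)
print(f"maximum gradient: {limit * 1.0e5 / 100.0:.2f} mb per 100 km")
print("speed at half the limit :", anticyclonic_speed(r, 0.5 * limit, lat))
print("speed at twice the limit:", anticyclonic_speed(r, 2.0 * limit, lat))
```

Under these assumptions the permitted gradient is only of order a millibar per 100 km, which illustrates why high-pressure regions have light winds.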

The  circulation  behavior,  exhibited  by  weather  patterns,  also  applies  to  ocean  currents  and  other  liquid 
flow  on  earth.  However,  the  residual  angular  momentum  of  the  liquid  often  can  overcome  the  Coriolis  terms. 
Thus  often  it  will  be  found  experimentally  that  water  exiting  the  bathtub  does  not  circulate  anticlockwise  in 
the  northern  hemisphere  as  predicted  by  the  Coriolis  force.  This  is  because  it  was  not  stationary  originally, 
but  rotating  slowly. 

Reliable prediction of weather is an extremely difficult, complicated and challenging task, which is of considerable importance in modern life. As discussed in chapter 15.8, fluid flow can be much more complicated than assumed in this discussion of air flow and weather. Both turbulent and laminar flow are possible. As a consequence, computer simulations of weather phenomena are difficult because the air flow can be turbulent and the transition from ordered to chaotic flow is very sensitive to the initial conditions. Typically the air flow can involve macroscopic ordered coherent structures over a wide dynamic range of dimensions, coexisting with chaotic regions. Computer simulations of fluid flow often are performed based on Lagrangian mechanics to exploit the scalar properties of the Lagrangian. Ordered coherent structures, ranging from microscopic bubbles to hurricanes, can be recognized by exploiting Lyapunov exponents to identify the ordered motion buried in the underlying chaos. Thus the techniques discussed in classical mechanics are of considerable importance outside of physics.


10.13  Foucault  pendulum 


A classic example of motion in non-inertial frames is the rotation of the Foucault pendulum on the surface of the earth. The Foucault pendulum is a spherical pendulum with a long suspension that oscillates in the $x$-$y$ plane with sufficiently small amplitude that the vertical velocity $\dot{z}$ is negligible. Assume that the pendulum is a simple pendulum of length $l$ and mass $m$ as shown in figure 10.10. The equation of motion is given by

$$\ddot{\mathbf{r}} = \mathbf{g} + \frac{\mathbf{T}}{m} - 2\boldsymbol{\Omega}\times\dot{\mathbf{r}} \tag{10.77}$$

where $\frac{\mathbf{T}}{m}$ is the acceleration produced by the tension in the pendulum suspension and the rotation vector of the earth is designated by $\boldsymbol{\Omega}$ to avoid confusion with the oscillation frequency of the pendulum $\omega_0$. The effective gravitational acceleration $\mathbf{g}$ is given by

$$\mathbf{g} = \mathbf{g}_0 - \boldsymbol{\Omega}\times\left[\boldsymbol{\Omega}\times(\mathbf{r} + \mathbf{R})\right] \tag{10.78}$$

Figure 10.10: Foucault pendulum.

that is, the true gravitational field $\mathbf{g}_0$ corrected for the centrifugal force.

Assume the small angle approximation for the deflection angle $\beta$ of the pendulum, then $T_z = T\cos\beta \approx T$ and $T_z = mg$, thus $T \approx mg$. Then, as shown in figure 10.10, the horizontal components of the restoring force are

$$T_x = -mg\frac{x}{l} \tag{10.79}$$

$$T_y = -mg\frac{y}{l} \tag{10.80}$$

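Since $\mathbf{g}$ is vertical, and neglecting terms involving $\dot{z}$, evaluating the cross product in equation (10.77) simplifies the equations of motion to

$$\ddot{x} = -g\frac{x}{l} + 2\dot{y}\,\Omega\cos\theta \tag{10.81}$$

$$\ddot{y} = -g\frac{y}{l} - 2\dot{x}\,\Omega\cos\theta \tag{10.82}$$

where $\theta$ is the colatitude, which is related to the latitude $\lambda$ by

$$\cos\theta = \sin\lambda \tag{10.83}$$

The natural angular frequency of the simple pendulum is

$$\omega_0 = \sqrt{\frac{g}{l}} \tag{10.84}$$

while the $z$ component of the earth's angular velocity is

$$\Omega_z = \Omega\cos\theta \tag{10.85}$$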


Thus equations 10.81 and 10.82 can be written as

$$\begin{aligned}
\ddot{x} - 2\Omega_z\dot{y} + \omega_0^2 x &= 0 \\
\ddot{y} + 2\Omega_z\dot{x} + \omega_0^2 y &= 0
\end{aligned} \tag{10.86}$$

These are two coupled equations that can be solved by making a coordinate transformation.

Define a new coordinate that is a complex number

$$\eta = x + iy \tag{10.87}$$

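Multiplying the second of the coupled equations 10.86 by $i$ and adding it to the first equation gives

$$\begin{aligned}
(\ddot{x} + i\ddot{y}) - 2\Omega_z(\dot{y} - i\dot{x}) + \omega_0^2(x + iy) &= 0 \\
(\ddot{x} + i\ddot{y}) + 2i\Omega_z(\dot{x} + i\dot{y}) + \omega_0^2(x + iy) &= 0
\end{aligned} \tag{10.88}$$

which can be written as a differential equation for $\eta$

$$\ddot{\eta} + 2i\Omega_z\dot{\eta} + \omega_0^2\eta = 0 \tag{10.89}$$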

Note that the complex number $\eta$ contains the same information regarding the position in the $x$-$y$ plane as equations 10.86. The plot of $\eta$ in the complex plane, the Argand diagram, is a bird's-eye view of the position coordinates $(x, y)$ of the pendulum. This second-order homogeneous differential equation has two independent solutions that can be derived by guessing a solution of the form

$$\eta(t) = Ae^{-i\alpha t} \tag{10.90}$$

Substituting equation 10.90 into 10.89 gives

$$\alpha^2 - 2\Omega_z\alpha - \omega_0^2 = 0$$

Therefore

$$\alpha = \Omega_z \pm \sqrt{\Omega_z^2 + \omega_0^2} \tag{10.91}$$

Assume that the angular velocity of the pendulum $\omega_0$ is very much higher than the angular velocity of the earth, i.e. $\omega_0 \gg \Omega$, then

$$\alpha \approx \Omega_z \pm \omega_0 \tag{10.92}$$

Thus the solution is of the form

$$\eta(t) = e^{-i\Omega_z t}\left(A_+ e^{-i\omega_0 t} + A_- e^{i\omega_0 t}\right) \tag{10.93}$$

This can be written as

$$\eta(t) = Ae^{-i\Omega_z t}\cos(\omega_0 t + \delta) \tag{10.94}$$

where the phase $\delta$ and amplitude $A$ depend on the initial conditions. Thus the plane of oscillation of the pendulum is defined by the ratio of the $x$ and $y$ coordinates, that is, by the phase angle $\Omega_z t$. This phase angle rotates with angular velocity $\Omega_z$ where

$$\Omega_z = \Omega\cos\theta = \Omega\sin\lambda \tag{10.95}$$
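The complex equation of motion 10.89 can also be integrated numerically and compared with the exact solution built from the two roots of equation 10.91. The sketch below assumes a pendulum released from rest at $\eta = 1$ and uses a deliberately exaggerated ratio $\Omega_z/\omega_0$ so that the precession is visible within a few swings; the parameter values and function names are illustrative assumptions.

```python
import cmath
import math


def acceleration(eta, eta_dot, omega0, omega_z):
    """Equation 10.89 solved for the second time derivative of eta = x + iy."""
    return -2j * omega_z * eta_dot - omega0**2 * eta


def integrate(omega0, omega_z, t_end, steps):
    """Runge-Kutta (RK4) integration of equation 10.89 from eta = 1, eta_dot = 0."""
    eta, eta_dot = 1.0 + 0.0j, 0.0 + 0.0j
    dt = t_end / steps
    for _ in range(steps):
        k1x, k1v = eta_dot, acceleration(eta, eta_dot, omega0, omega_z)
        k2x = eta_dot + 0.5 * dt * k1v
        k2v = acceleration(eta + 0.5 * dt * k1x, k2x, omega0, omega_z)
        k3x = eta_dot + 0.5 * dt * k2v
        k3v = acceleration(eta + 0.5 * dt * k2x, k3x, omega0, omega_z)
        k4x = eta_dot + dt * k3v
        k4v = acceleration(eta + dt * k3x, k4x, omega0, omega_z)
        eta += dt / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
        eta_dot += dt / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
    return eta


def exact(omega0, omega_z, t):
    """Superposition of the two roots of equation 10.91 for release from rest."""
    w1 = math.sqrt(omega_z**2 + omega0**2)
    a_plus = (w1 - omega_z) / (2.0 * w1)   # multiplies exp(-i (omega_z + w1) t)
    a_minus = (w1 + omega_z) / (2.0 * w1)  # multiplies exp(-i (omega_z - w1) t)
    return cmath.exp(-1j * omega_z * t) * (
        a_plus * cmath.exp(-1j * w1 * t) + a_minus * cmath.exp(1j * w1 * t))


omega0 = 2.0 * math.pi   # one swing per unit time (illustrative)
omega_z = 0.05 * omega0  # exaggerated rotation rate (illustrative)
t_end = 10.0

print("numeric:", integrate(omega0, omega_z, t_end, 20000))
print("exact  :", exact(omega0, omega_z, t_end))
```

The common factor $e^{-i\Omega_z t}$ in the exact solution is what rotates the swing plane at the rate $\Omega_z$, in line with equation 10.94.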

At the north pole the earth rotates under the pendulum with angular velocity $\Omega$ and the axis of the pendulum is fixed in an inertial frame of reference. At lower latitudes, the pendulum precesses at the lower angular frequency $\Omega_z = \Omega\sin\lambda$ that goes to zero at the equator. For example, in Rochester, NY, $\lambda = 43°$ N, and therefore a Foucault pendulum precesses at $\Omega_z = 0.682\,\Omega$. That is, the pendulum precesses 245.5°/day.
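The precession rate for any latitude follows directly from equation 10.95. The short sketch below tabulates it, taking one full rotation of the Earth as 360° per day as in the Rochester example; the function name and the list of latitudes are illustrative assumptions.

```python
import math


def precession_deg_per_day(lat_deg):
    """Foucault precession in degrees per rotation of the Earth (equation 10.95)."""
    return 360.0 * math.sin(math.radians(lat_deg))


for lat in (0.0, 30.0, 43.0, 60.0, 90.0):
    print(f"latitude {lat:4.1f} deg: {precession_deg_per_day(lat):6.1f} deg/day")
```

At latitude 43° this reproduces the 245.5°/day quoted above, and at 30° the swing plane needs two days to complete a full turn.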


10.14  Summary 

This chapter has focussed on describing motion in non-inertial frames of reference. It has been shown that the force and acceleration in non-inertial frames can be related using either Newtonian or Lagrangian mechanics by introducing additional inertial forces in the non-inertial reference frame.

Translational acceleration of a reference frame: In a primed frame that is undergoing translational acceleration $\mathbf{A}$, the motion in this non-inertial frame can be calculated by addition of an inertial force $-m\mathbf{A}$, which leads to an equation of motion

$$m\mathbf{a}' = \mathbf{F} - m\mathbf{A} \tag{10.6}$$

Note that the primed frame is an inertial frame if $\mathbf{A} = 0$.


Rotating reference frame: It was shown that the time derivatives of a general vector $\mathbf{G}$ in both an inertial frame and a rotating reference frame are related by

$$\left(\frac{d\mathbf{G}}{dt}\right)_{fixed} = \left(\frac{d\mathbf{G}}{dt}\right)_{rotating} + \boldsymbol{\omega}\times\mathbf{G} \tag{10.16}$$

where the $\boldsymbol{\omega}\times\mathbf{G}$ term originates from the fact that the unit vectors in the rotating reference frame are time dependent with respect to the inertial frame.


Reference frame undergoing both rotation and translation: Both Newtonian and Lagrangian mechanics were used to show that for the case of translational acceleration plus rotation, the effective force in the non-inertial (double-primed) frame can be written as

$$\mathbf{F}_{eff} = m\mathbf{a}'' = \mathbf{F} - m\left(\mathbf{A} + \boldsymbol{\omega}\times\mathbf{V} + 2\boldsymbol{\omega}\times\mathbf{v}'' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}') + \dot{\boldsymbol{\omega}}\times\mathbf{r}'\right) \tag{10.28, 10.36}$$

These inertial correction forces result from describing the system in a non-inertial frame. These inertial forces are felt when in the rotating-translating frame of reference. Thus the notion of these inertial forces can be very useful for solving problems in non-inertial frames. For the case of rotating frames, two important inertial forces are the centrifugal force, $-m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}')$, and the Coriolis force, $-2m\boldsymbol{\omega}\times\mathbf{v}''$.

Routhian reduction for rotating systems: It was shown that for non-inertial systems, identical equations of motion are derived using Newtonian, Lagrangian, Hamiltonian, and Routhian mechanics.

Terrestrial  manifestations  of  rotation  Examples  of  motion  in  rotating  frames  presented  in  the  chapter 
included  projectile  motion  with  respect  to  the  surface  of  the  Earth,  rotation  alignment  of  nucleons  in  rotating 
nuclei,  and  weather  phenomena. 


Workshop  exercises 

1. Consider a fixed reference frame $S$ and a rotating frame $S'$. The origins of the two coordinate systems always coincide. By carefully drawing a diagram, derive an expression relating the coordinates of a point $P$ in the two systems. (This was covered in Chapter 2, but it is worth reviewing now.)

2.  The  effective  force  observed  in  a rotating  coordinate  system  is  given  by  equation  10.28. 

(a)  What  is  the  significance  of  each  term  in  this  expression? 

(b) Suppose you wanted to measure the gravitational force, both magnitude and direction, on a body of mass $m$ at rest on the surface of the Earth. What terms in the effective force can be neglected?

(c)  Suppose  you  wanted  to  calculate  the  deflection  of  a projectile  fired  horizontally  along  the  Earth’s  surface. 
What  terms  in  the  effective  force  can  be  neglected? 

(d) Suppose you wanted to calculate the effective force on a small block of mass $m$ placed on a frictionless turntable rotating with a time-dependent angular velocity $\omega(t)$. What terms in the effective force can be neglected?

3.  A plumb  line  is  carried  along  in  a moving  train,  with  m the  mass  of  the  plumb  bob.  Neglect  any  effects  due  to 
the  rotation  of  the  Earth  and  work  in  the  noninertial  frame  of  reference  of  the  train. 

(a) Find the tension in the cord and the deflection from the local vertical if the train is moving with constant acceleration $a_0$.

(b) Find the tension in the cord and the deflection from the local vertical if the train is rounding a curve of radius $\rho$ with constant speed $V_0$.

4. A bead on a rotating rod is free to slide without friction. The rod has a length $L$ and rotates about its end with angular velocity $\omega$. The bead is initially released from rest (relative to the rod) at the midpoint of the rod.

(a) Find the displacement of the bead along the rod as a function of time.

(b)  Find  the  time  when  the  bead  leaves  the  end  of  the  rod. 

(c)  Find  the  velocity  (relative  to  the  rod)  of  the  bead  when  it  leaves  the  end  of  the  rod. 

5.  Here  is  a “thought  experiment”  for  you  to  consider.  Suppose  you  are  in  a small  sailboat  of  mass  M at  the 
Earth’s  equator.  At  the  equator  there  is  very  little  wind  (this  is  known  as  the  “equatorial  doldrums”),  so  your 
sailboat  is,  more  or  less,  sitting  still.  You  have  a small  anchor  of  mass  m on  deck  and  a single  mast  of  height 
h in  the  middle  of  the  boat.  How  can  you  use  the  anchor  to  put  the  boat  into  motion?  In  which  direction  will 
the  boat  move? 

6.  Does  water  really  flow  in  the  other  direction  when  you  flush  a toilet  in  the  southern  hemisphere?  What  (if 
anything)  does  the  Coriolis  force  have  to  do  with  this? 

7. We are presently at a latitude $\lambda$ (with respect to the equator) and Earth is rotating with constant angular velocity $\omega$. Consider the following two scenarios: Scenario A: A particle is thrown upward with initial speed $V_0$. Scenario B: An identical particle is dropped (at rest) from the maximum height of the particle in Scenario A. Circle all the true statements regarding the Coriolis deflection, assuming that the particles have landed for (a) and (b).

(a) The magnitude is greater in A than in B.

(b) The direction in A and B are the same.

(c) The direction in A does not change throughout flight.


Problems 


1. If a projectile is fired due east from a point on the surface of the Earth at a northern latitude $\lambda$ with a velocity of magnitude $V_0$ and at an inclination to the horizontal of $\alpha$, show that the lateral deflection when the projectile strikes the Earth is

$$d = \frac{4V_0^3}{g^2}\,\omega\sin\lambda\,\sin^2\alpha\cos\alpha$$

where $\omega$ is the rotation frequency of the Earth.


2. In the preceding problem, if the range of the projectile is $R'_0$ for the case $\omega = 0$, show that the change of range due to rotation of the Earth is

$$\Delta R' = \sqrt{\frac{2R_0'^3}{g}}\,\omega\cos\lambda\left[\cot^{1/2}\alpha - \frac{1}{3}\tan^{3/2}\alpha\right]$$


3. Obtain an expression for the angular deviation of a particle projected from the North Pole in a path that lies close to the surface of the earth. Is the deviation significant for a missile that makes a 4800-km flight in 10 minutes? What is the "miss distance" if the missile is aimed directly at the target? Is the miss distance greater for a 19300-km flight at the same velocity?


Chapter  11 


Rigid-body  rotation 


11.1  Introduction 

Rigid-body  rotation  features  prominently  in  science,  engineering,  and  sports.  Prior  chapters  have  focussed 
primarily  on  motion  of  point  particles.  This  chapter  extends  the  discussion  to  motion  of  finite-sized  rigid 
bodies.  A rigid  body  is  a collection  of  particles  where  the  relative  separations  remain  rigidly  fixed.  In  real 
life,  there  is  always  some  motion  between  individual  atoms,  but  usually  this  microscopic  motion  can  be 
neglected  when  describing  macroscopic  properties.  Note  that  the  concept  of  perfect  rigidity  has  limitations 
in  the  theory  of  relativity  since  information  cannot  travel  faster  than  the  velocity  of  light,  and  thus  signals 
cannot  be  transmitted  instantaneously  between  the  ends  of  a rigid  body  which  is  implied  if  the  body  had 
perfect  rigidity. 

The description of rigid-body rotation is most easily handled by specifying the properties of the body in the rotating body-fixed coordinate frame whereas the observables are measured in the stationary inertial laboratory coordinate frame. In the body-fixed coordinate frame, the primary observable for classical mechanics is the inertia tensor of the rigid body which is well defined and independent of the rotational motion. By contrast, in the stationary inertial frame the observables depend sensitively on the details of the rotational motion. For example, when observed in the stationary fixed frame, rapid rotation of a pencil about the longitudinal symmetry axis gives a time-averaged shape of the pencil that looks like a thin cylinder, whereas the time-averaged shape is a flat disk for rotation perpendicular to the symmetry axis of the pencil. In spite of this, the pencil always has the same unique inertia tensor in the body-fixed frame. Thus the best solution for describing rotation of a rigid body is to use a rotation matrix that transforms from the stationary fixed frame to an instantaneous body-fixed frame for which the moment of inertia tensor can be evaluated. Moreover, the problem can be greatly simplified by transforming to a body-fixed coordinate frame that is aligned with any symmetry axes of the body since then the inertia tensor can be diagonal; this is called a principal axis system.

Rigid-body  rotation  can  be  broken  into  the  following  two  classifications. 

1)  Rotation  about  a fixed  axis: 

A body  can  be  constrained  to  rotate  about  an  axis  that  has  a fixed  location  and  orientation  relative  to 
the  body.  The  hinged  door  is  a typical  example.  Rotation  about  a fixed  axis  is  straightforward  since  the 
axis  of  rotation,  plus  the  moment  of  inertia  about  this  axis,  are  well  defined  and  this  case  was  discussed  in 
chapter  2.12.7. 

2)  Rotation  about  a point 

A body  can  be  constrained  to  rotate  about  a fixed  point  of  the  body  but  the  orientation  of  this  rotation 
axis  about  this  point  is  unconstrained.  One  example  is  rotation  of  an  object  flying  freely  in  space  which  can 
rotate  about  the  center  of  mass  with  any  orientation.  Another  example  is  a child’s  spinning  top  which  has 
one  point  constrained  to  touch  the  ground  but  the  orientation  of  the  rotation  axis  is  undefined. 

The  prior  discussion  in  chapter  2.12.7  showed  that  rigid-body  rotation  is  more  complicated  than  assumed 
in  introductory  treatments  of  rigid-body  rotation.  It  is  necessary  to  expand  the  concept  of  moment  of  inertia 
to  the  concept  of  the  inertia  tensor,  plus  the  fact  that  the  angular  momentum  may  not  point  along  the 
rotation  axis.  The  most  general  case  requires  consideration  of  rotation  about  a body-fixed  point  where  the 
orientation  of  the  axis  of  rotation  is  unconstrained.  The  concept  of  the  inertia  tensor  of  a rotating  body  is 
crucial for describing rigid-body motion. It will be shown that working in the body-fixed coordinate frame of
a rotating body allows a description of the equations of motion in terms of the inertia tensor for a given point
of the body, and that it is possible to rotate the body-fixed coordinate system into a principal axis system
where the inertia tensor is diagonal. Whenever the angular velocity is aligned with a principal axis, the
angular momentum is parallel to the angular velocity. The use of a principal axis system greatly simplifies
treatment of rigid-body rotation and exploits the powerful and elegant matrix algebra mentioned in appendix A.

The following discussion of rigid-body rotation is broken into three topics: (1) the inertia tensor of the
rigid body, (2) the transformation between the rotating body-fixed coordinate system and the laboratory
frame, i.e., the Euler angles specifying the orientation of the body-fixed coordinate frame with respect to the
laboratory frame, and (3) Lagrange and Euler's equations of motion for rigid bodies. This is followed by a
discussion of practical applications.


11.2  Rigid-body  coordinates 

Motion of a rigid body is a special case of motion of the N-body system in which the relative positions of
the N bodies are rigidly related. It was shown in chapter 2 that the motion of a rigid body can be broken into
a combination of a linear translation of some point in the body, plus rotation of the body about an axis
through that point. This is called Chasles' Theorem. Thus the position of every particle in the rigid body
is fixed with respect to one point in the body. If the fixed point of the body is chosen to be the center of
mass, then, as discussed in chapter 2, it is possible to separate the kinetic energy, linear momentum, and
angular momentum into the center-of-mass motion, plus the motion about the center of mass. Thus the
behavior of the body can be described completely using only six independent coordinates governed by six
equations of motion, three for translation and three for rotation.

Referred to an inertial frame, the translational motion of the center of mass is governed by

$$\mathbf{F}^E = \frac{d\mathbf{P}}{dt} \tag{11.1}$$

while the rotational motion about the center of mass is determined by

$$\mathbf{N}^E = \frac{d\mathbf{L}}{dt} \tag{11.2}$$

where the external force $\mathbf{F}^E$ and external torque $\mathbf{N}^E$ are identified separately from the internal forces acting
between the particles in the rigid body. It will be assumed that the internal forces are central and thus do
not contribute to the angular momentum.

The location of any fixed point in the body, such as the center of mass, can be specified by three
generalized cartesian coordinates with respect to a fixed frame. The rotation of the body-fixed axis system
about this fixed point in the body can be described in terms of three independent angles with respect to the
fixed frame. There are several possible sets of orthogonal angles that can be used to describe the rotation.
This book uses the Euler angles $\phi, \theta, \psi$ which correspond first to a rotation $\phi$ about the z axis, then a rotation
$\theta$ about the x axis following the first rotation, and finally a rotation $\psi$ about the new z axis following the
first two rotations. The Euler angles will be discussed in detail following introduction of the inertia tensor
and angular momentum.


11.3 Rigid-body rotation about a body-fixed point

With respect to some point O fixed in the body coordinate system, the angular momentum of the body is
given by

$$\mathbf{L} = \sum_i^n \mathbf{L}_i = \sum_i^n \mathbf{r}_i \times \mathbf{p}_i \tag{11.3}$$

There  are  two  especially  convenient  choices  for  the  fixed  point  O.  If  no  point  in  the  body  is  fixed  with 
respect  to  an  inertial  coordinate  system,  then  it  is  best  to  choose  O as  the  center  of  mass.  If  one  point  of 
the  body  is  fixed  with  respect  to  a fixed  inertial  coordinate  system,  such  as  a point  on  the  ground  where  a 
child’s  spinning  top  touches,  then  it  is  best  to  choose  this  stationary  point  as  the  body-fixed  point  O. 




Consider a rigid body composed of N particles of mass $m_\alpha$ where $\alpha = 1, 2, 3, \ldots, N$. As discussed in
chapter 10.4, if the body rotates with an instantaneous angular velocity $\boldsymbol{\omega}$ about some fixed point, with
respect to the body-fixed coordinate system, and this point has an instantaneous translational velocity $\mathbf{V}$
with respect to the fixed (inertial) coordinate system, see figure 11.1, then the instantaneous velocity $\mathbf{v}_\alpha$
of the $\alpha$th particle in the fixed frame of reference is given by

$$\mathbf{v}_\alpha = \mathbf{V} + \mathbf{v}''_\alpha + \boldsymbol{\omega} \times \mathbf{r}'_\alpha \tag{11.4}$$

However, for a rigid body, the velocity of a body-fixed point with respect to the body is zero, that is
$\mathbf{v}''_\alpha = 0$, thus

$$\mathbf{v}_\alpha = \mathbf{V} + \boldsymbol{\omega} \times \mathbf{r}'_\alpha \tag{11.5}$$

Consider the translational velocity of the body-fixed point O to be zero, i.e. $\mathbf{V} = 0$, and let $\mathbf{R} = 0$, then
$\mathbf{r}_\alpha = \mathbf{r}'_\alpha$. These assumptions allow the linear momentum of the particle $\alpha$ to be written as

$$\mathbf{p}_\alpha = m_\alpha \mathbf{v}_\alpha = m_\alpha \boldsymbol{\omega} \times \mathbf{r}_\alpha \tag{11.6}$$
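Therefore

$$\mathbf{L} = \sum_\alpha^N \mathbf{r}_\alpha \times \mathbf{p}_\alpha = \sum_\alpha^N m_\alpha \mathbf{r}_\alpha \times (\boldsymbol{\omega} \times \mathbf{r}_\alpha) \tag{11.7}$$

Using the vector identity

$$\mathbf{A} \times (\mathbf{B} \times \mathbf{A}) = A^2 \mathbf{B} - \mathbf{A} (\mathbf{A} \cdot \mathbf{B})$$

leads to

$$\mathbf{L} = \sum_\alpha^N m_\alpha \left[ r_\alpha^2 \boldsymbol{\omega} - \mathbf{r}_\alpha (\mathbf{r}_\alpha \cdot \boldsymbol{\omega}) \right] \tag{11.8}$$

Figure 11.1: Infinitesimal displacement $d\mathbf{r}'$, in the primed frame, broken into a part $d\mathbf{r}_R$ due to rotation
of the primed frame plus a part $d\mathbf{r}''$ due to displacement with respect to this rotating frame.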
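
The vector identity used to reach (11.8) is a special case of the general triple-product rule. As a brief
sketch of the missing step, written in LaTeX notation and assuming only the standard "BAC-CAB" expansion:

```latex
\begin{aligned}
\mathbf{A} \times (\mathbf{B} \times \mathbf{C})
  &= \mathbf{B}\,(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}\,(\mathbf{A}\cdot\mathbf{B}) \\
\mathbf{A} \times (\mathbf{B} \times \mathbf{A})
  &= A^2\,\mathbf{B} - \mathbf{A}\,(\mathbf{A}\cdot\mathbf{B})
  && (\mathbf{C} = \mathbf{A}) \\
\mathbf{r}_\alpha \times (\boldsymbol{\omega} \times \mathbf{r}_\alpha)
  &= r_\alpha^2\,\boldsymbol{\omega} - \mathbf{r}_\alpha\,(\mathbf{r}_\alpha\cdot\boldsymbol{\omega})
  && (\mathbf{A} = \mathbf{r}_\alpha,\ \mathbf{B} = \boldsymbol{\omega})
\end{aligned}
```

Multiplying the last line by $m_\alpha$ and summing over $\alpha$ gives (11.8).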


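The angular momentum can be expressed in terms of components of $\boldsymbol{\omega}$ and $\mathbf{r}'_\alpha$ relative to the body-fixed
frame. The following formulae can be written more compactly if $\mathbf{r}_\alpha = (x_\alpha, y_\alpha, z_\alpha)$, in the rotating body-fixed
frame, is written in the form $\mathbf{r}_\alpha = (x_{\alpha,1}, x_{\alpha,2}, x_{\alpha,3})$ where the axes are defined by the numbers 1, 2, 3 rather
than x, y, z. In this notation, the angular momentum is written in component form as

$$L_i = \sum_\alpha^N m_\alpha \left[ \omega_i \sum_k x_{\alpha,k}^2 - x_{\alpha,i} \sum_j x_{\alpha,j} \omega_j \right] \tag{11.9}$$

Assume the Kronecker delta relation

$$\omega_i = \sum_j^3 \omega_j \delta_{ij} \tag{11.10}$$

where

$$\delta_{ij} = 1 \quad (i = j), \qquad \delta_{ij} = 0 \quad (i \neq j)$$

Substituting (11.10) in (11.9) gives

$$L_i = \sum_\alpha^N m_\alpha \left[ \sum_j \omega_j \delta_{ij} \sum_k x_{\alpha,k}^2 - \sum_j x_{\alpha,i} x_{\alpha,j} \omega_j \right]
     = \sum_j^3 \left[ \sum_\alpha^N m_\alpha \left( \delta_{ij} \sum_k x_{\alpha,k}^2 - x_{\alpha,i} x_{\alpha,j} \right) \right] \omega_j \tag{11.11}$$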
11.4 Inertia tensor


The square bracket term in (11.11) is called the moment of inertia tensor $\mathbf{I}$, which is usually referred to
as the inertia tensor

$$I_{ij} \equiv \sum_\alpha^N m_\alpha \left( \delta_{ij} \sum_k^3 x_{\alpha,k}^2 - x_{\alpha,i} x_{\alpha,j} \right) \tag{11.12}$$


In most cases it is more useful to express the components of the inertia tensor in an integral form over
the mass distribution rather than a summation for N discrete bodies. That is,

$$I_{ij} = \int \rho(\mathbf{r}') \left( \delta_{ij} \sum_k^3 x_k^2 - x_i x_j \right) dV \tag{11.13}$$


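The inertia tensor is easier to understand when written in cartesian coordinates $\mathbf{r}_\alpha = (x_\alpha, y_\alpha, z_\alpha)$ rather
than in the form $\mathbf{r}_\alpha = (x_{\alpha,1}, x_{\alpha,2}, x_{\alpha,3})$. Then, the diagonal moments of inertia of the inertia tensor are

$$\begin{aligned}
I_{xx} &\equiv \sum_\alpha^N m_\alpha \left[ x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - x_\alpha^2 \right] = \sum_\alpha^N m_\alpha \left[ y_\alpha^2 + z_\alpha^2 \right] \\
I_{yy} &\equiv \sum_\alpha^N m_\alpha \left[ x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - y_\alpha^2 \right] = \sum_\alpha^N m_\alpha \left[ x_\alpha^2 + z_\alpha^2 \right] \\
I_{zz} &\equiv \sum_\alpha^N m_\alpha \left[ x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - z_\alpha^2 \right] = \sum_\alpha^N m_\alpha \left[ x_\alpha^2 + y_\alpha^2 \right]
\end{aligned} \tag{11.14}$$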


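while the off-diagonal products of inertia are

$$\begin{aligned}
I_{yx} = I_{xy} &\equiv -\sum_\alpha^N m_\alpha \left[ x_\alpha y_\alpha \right] \\
I_{zx} = I_{xz} &\equiv -\sum_\alpha^N m_\alpha \left[ x_\alpha z_\alpha \right] \\
I_{zy} = I_{yz} &\equiv -\sum_\alpha^N m_\alpha \left[ y_\alpha z_\alpha \right]
\end{aligned} \tag{11.15}$$

Note that the products of inertia are symmetric in that

$$I_{ij} = I_{ji} \tag{11.16}$$

The above notation for the inertia tensor allows the angular momentum (11.11) to be written as

$$L_i = \sum_j^3 I_{ij} \omega_j \tag{11.17}$$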


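Expanded in cartesian coordinates

$$\begin{aligned}
L_x &= I_{xx}\omega_x + I_{xy}\omega_y + I_{xz}\omega_z \\
L_y &= I_{yx}\omega_x + I_{yy}\omega_y + I_{yz}\omega_z \\
L_z &= I_{zx}\omega_x + I_{zy}\omega_y + I_{zz}\omega_z
\end{aligned} \tag{11.18}$$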

Note  that  every  fixed  point  in  a body  has  a specific  inertia  tensor.  The  components  of  the  inertia  tensor 
at  a specified  point  depend  on  the  orientation  of  the  coordinate  frame  whose  origin  is  located  at  the  specified 
fixed  point.  For  example,  the  inertia  tensor  for  a cube  is  very  different  when  the  fixed  point  is  at  the  center 
of  mass  compared  with  when  the  fixed  point  is  at  a corner  of  the  cube. 
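
As a hedged numerical illustration of equations (11.7), (11.12), and (11.17), the following Python sketch builds
the inertia tensor for a small set of point masses and checks that the tensor product reproduces the direct
sum over particles. The masses, positions, angular velocity, and function names are invented for illustration,
and NumPy is assumed to be available.

```python
import numpy as np

def inertia_tensor(masses, positions):
    """Inertia tensor of point masses about the origin, equation (11.12)."""
    m = np.asarray(masses, dtype=float)
    r = np.asarray(positions, dtype=float)          # shape (N, 3)
    r2 = np.sum(r * r, axis=1)                      # |r_alpha|^2 for each particle
    # sum_alpha m_alpha (delta_ij r_alpha^2 - x_alpha,i x_alpha,j)
    return np.sum(m * r2) * np.eye(3) - np.einsum("a,ai,aj->ij", m, r, r)

def angular_momentum_direct(masses, positions, omega):
    """L = sum_alpha m_alpha r_alpha x (omega x r_alpha), equation (11.7)."""
    m = np.asarray(masses, dtype=float)
    r = np.asarray(positions, dtype=float)
    v = np.cross(omega, r)                          # omega x r_alpha, row by row
    return np.sum(m[:, None] * np.cross(r, v), axis=0)

# Illustrative (made-up) body: three point masses, rotation about the 3 axis
masses = [1.0, 2.0, 3.0]
positions = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0], [1.0, 2.0, 0.0]]
omega = np.array([0.0, 0.0, 1.0])

I = inertia_tensor(masses, positions)
L_tensor = I @ omega                                # equation (11.17)
L_direct = angular_momentum_direct(masses, positions, omega)

print(I)
print(L_tensor, L_direct)
```

For this made-up configuration the angular momentum has a component perpendicular to the angular velocity,
even though the angular velocity points along the 3 axis, because the products of inertia do not vanish.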
11.5 Matrix and tensor formulations of rigid-body rotation


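The above notation is clumsy and can be streamlined by use of matrix methods. Write the inertia tensor in
a matrix form as

$$\{\mathbf{I}\} \equiv \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \tag{11.19}$$

The angular velocity and angular momentum both can be written as column vectors, that is

$$\boldsymbol{\omega} = \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} \qquad
\mathbf{L} = \begin{pmatrix} L_1 \\ L_2 \\ L_3 \end{pmatrix} \tag{11.20}$$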


As discussed in appendix E.2, equation (11.18) now can be written in tensor notation as an inner product
of the form

$$\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} \tag{11.21}$$

Note that the above notation uses boldface for the inertia tensor $\mathbf{I}$, implying a rank-2 tensor representation,
while the angular velocity $\boldsymbol{\omega}$ and the angular momentum $\mathbf{L}$ are written as column vectors. The inertia tensor
is a 9-component rank-2 tensor defined as the ratio of the angular momentum vector $\mathbf{L}$ and the angular
velocity $\boldsymbol{\omega}$.

$$\{\mathbf{I}\} = \frac{\mathbf{L}}{\boldsymbol{\omega}} \tag{11.22}$$

Note that, as described in appendix E, the inner product of a vector $\boldsymbol{\omega}$, which is a rank-1 tensor, and a
rank-2 tensor $\{\mathbf{I}\}$, leads to the vector $\mathbf{L}$. This compact notation exploits the fact that the matrix and tensor
representations are completely equivalent, and are ideally suited to the description of rigid-body rotation.


11.6  Principal  axis  system 

The inertia tensor is a real symmetric matrix because of the symmetry given by equation (11.16). A property
of real symmetric matrices is that there exists an orientation of the coordinate frame, with its origin at the
chosen body-fixed point O, such that the inertia tensor is diagonal. The coordinate system for which the
inertia tensor is diagonal is called the principal axis system, which has three perpendicular principal
axes. Thus, in the principal axis system, the inertia tensor has the form

$$\{\mathbf{I}\} = \begin{pmatrix} I_{11} & 0 & 0 \\ 0 & I_{22} & 0 \\ 0 & 0 & I_{33} \end{pmatrix} \tag{11.23}$$

where $I_{jj}$ are real numbers, which are called the principal moments of inertia of the body, and are
usually written as $I_j$. When the angular velocity vector $\boldsymbol{\omega}$ points along any principal axis unit vector $\hat{\mathbf{j}}$, then
the angular momentum $\mathbf{L}$ is parallel to $\boldsymbol{\omega}$, and the magnitude of the principal moment of inertia about this
principal axis is given by the relation

$$L_j \hat{\mathbf{j}} = I_j \omega_j \hat{\mathbf{j}} \tag{11.24}$$

The  principal  axes  are  fixed  relative  to  the  shape  of  the  rigid  body  and  they  are  invariant  to  the  orientation 
of  the  body-fixed  coordinate  system  used  to  evaluate  the  inertia  tensor.  The  advantage  of  having  the  body- 
fixed  coordinate  frame  aligned  with  the  principal  axis  coordinate  frame  is  that  then  the  inertia  tensor  is 
diagonal,  which  greatly  simplifies  the  matrix  algebra.  Even  when  the  body-fixed  coordinate  system  is  not 
aligned  with  the  principal  axis  frame,  if  the  angular  velocity  is  specified  to  point  along  a principal  axis  then 
the  corresponding  moment  of  inertia  will  be  given  by  (11.24) . 

In principle it is possible to locate the principal axes by varying the orientation of the angular velocity
vector $\boldsymbol{\omega}$ to find those orientations for which the angular momentum $\mathbf{L}$ and angular velocity $\boldsymbol{\omega}$ are parallel,
which characterizes the principal axes. However, the best approach is to diagonalize the inertia tensor.
11.7 Diagonalize the inertia tensor


Finding the three principal axes involves diagonalizing the inertia tensor, which is the classic eigenvalue
problem discussed in appendix A. Solution of the eigenvalue problem for rigid-body motion corresponds to
a rotation of the coordinate frame to the principal axes resulting in the matrix equation

$$\{\mathbf{I}\} \cdot \boldsymbol{\omega} = I \boldsymbol{\omega} \tag{11.25}$$

where $I$ denotes any one of the three eigenvalues, while the corresponding vector $\boldsymbol{\omega}$ is the eigenvector.
Appendix A.4 gives the solution of the matrix relation

$$\{\mathbf{I}\} \cdot \boldsymbol{\omega} = I \{\mathbb{1}\} \boldsymbol{\omega} \tag{11.26}$$

where the three eigenvalues $I$ are the principal axis moments of inertia, and $\{\mathbb{1}\}$ is the unity tensor,
equation A.2.4.

$$\{\mathbb{1}\} \equiv \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{11.27}$$

Rewriting (11.26) gives

$$\left( \{\mathbf{I}\} - I \{\mathbb{1}\} \right) \cdot \boldsymbol{\omega} = 0 \tag{11.28}$$


This is a matrix equation of the form $\mathbf{A} \cdot \boldsymbol{\omega} = 0$ where $\mathbf{A}$ is a 3 x 3 matrix and $\boldsymbol{\omega}$ is a vector with values
$\omega_x, \omega_y, \omega_z$. The matrix equation $\mathbf{A} \cdot \boldsymbol{\omega} = 0$ really corresponds to three simultaneous equations for the three
numbers $\omega_x, \omega_y, \omega_z$. It is a well-known property of equations like (11.28) that they have a non-zero solution
if, and only if, the determinant $\det(\mathbf{A})$ is zero, that is

$$\det\left( \{\mathbf{I}\} - I \{\mathbb{1}\} \right) = 0 \tag{11.29}$$


This is called the characteristic equation, or secular equation for the matrix $\{\mathbf{I}\}$. The determinant
involved is a cubic equation in the value of $I$ that gives the three principal moments of inertia. Inserting
one of the three values of $I$ into equation (11.17) gives the corresponding eigenvector $\boldsymbol{\omega}$. Applying the above
eigenvalue problem to rigid-body rotation corresponds to requiring that some arbitrary set of body-fixed
axes be the principal axes of inertia. This is obtained by rotating the body-fixed axis system such that

$$\begin{aligned}
L_1 &= I_{11}\omega_1 + I_{12}\omega_2 + I_{13}\omega_3 = I\omega_1 \\
L_2 &= I_{21}\omega_1 + I_{22}\omega_2 + I_{23}\omega_3 = I\omega_2 \\
L_3 &= I_{31}\omega_1 + I_{32}\omega_2 + I_{33}\omega_3 = I\omega_3
\end{aligned} \tag{11.30}$$

or

$$\begin{aligned}
(I_{11} - I)\,\omega_1 + I_{12}\omega_2 + I_{13}\omega_3 &= 0 \\
I_{21}\omega_1 + (I_{22} - I)\,\omega_2 + I_{23}\omega_3 &= 0 \\
I_{31}\omega_1 + I_{32}\omega_2 + (I_{33} - I)\,\omega_3 &= 0
\end{aligned} \tag{11.31}$$


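These equations have a non-trivial solution for the ratios $\omega_1 : \omega_2 : \omega_3$ since the determinant vanishes, that is

$$\begin{vmatrix} (I_{11} - I) & I_{12} & I_{13} \\ I_{21} & (I_{22} - I) & I_{23} \\ I_{31} & I_{32} & (I_{33} - I) \end{vmatrix} = 0 \tag{11.32}$$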


The expansion of this determinant leads to a cubic equation with three roots for $I$. This is the secular
equation for $I$ whose eigenvalues are the principal moments of inertia.

The directions of the principal axes, that is the eigenvectors, can be found by substituting the
corresponding solution for $I$ into the prior equation. Thus for eigensolution $I_1$ the eigenvector is given by
solving

$$\begin{aligned}
(I_{11} - I_1)\,\omega_{11} + I_{12}\omega_{21} + I_{13}\omega_{31} &= 0 \\
I_{21}\omega_{11} + (I_{22} - I_1)\,\omega_{21} + I_{23}\omega_{31} &= 0 \\
I_{31}\omega_{11} + I_{32}\omega_{21} + (I_{33} - I_1)\,\omega_{31} &= 0
\end{aligned} \tag{11.33}$$
These equations are solved for the ratios $\omega_{11} : \omega_{21} : \omega_{31}$ which are the direction numbers of the principal axis
system corresponding to solution $I_1$. This principal axis system is defined relative to the original coordinate
system. This procedure is repeated to find the orientation of the other two mutually perpendicular principal
axes.
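
In numerical work the secular equation is rarely expanded by hand. The following Python sketch, which assumes
NumPy is available and uses a made-up symmetric tensor in arbitrary units, obtains the principal moments and
principal axes with `numpy.linalg.eigh` and checks them against the eigenvalue equation (11.25). It is an
illustration of the procedure rather than a prescribed method.

```python
import numpy as np

# Hypothetical inertia tensor in arbitrary units; any real symmetric 3x3 matrix works.
I = np.array([[16.0, -6.0,  0.0],
              [-6.0,  6.0, -2.0],
              [ 0.0, -2.0, 18.0]])

# eigh is intended for symmetric matrices: the eigenvalues are returned in
# ascending order and the columns of `axes` are orthonormal eigenvectors.
moments, axes = np.linalg.eigh(I)

# Residual of the eigenvalue equation {I}.w = I w for each principal axis
residuals = [np.linalg.norm(I @ axes[:, k] - moments[k] * axes[:, k]) for k in range(3)]

# Rotating into the principal-axis frame makes the tensor diagonal.
I_principal = axes.T @ I @ axes

print(np.round(moments, 4))
print(np.round(I_principal, 4))
print(residuals)
```

The columns of `axes` play the role of the direction numbers $\omega_{1m} : \omega_{2m} : \omega_{3m}$, normalized to unit length.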


11.8  Parallel-axis  theorem 


The values of the components of the inertia tensor depend on the point and the orientation about which the
body rotates relative to the body-fixed coordinate system. The parallel-axis theorem is valuable for relating
the inertia tensor for rotation about parallel axes passing through different points fixed with respect to the
rigid body. For example, one may wish to relate the inertia tensor through the center of mass to another
location that may be constrained to remain stationary, like the tip of the spinning top.

Figure 11.2: Transformation between two parallel body-coordinate systems, O and Q.

Consider the mass $\alpha$ at the location $\mathbf{r} = (x_1, x_2, x_3)$ with respect to the origin of the center of mass
body-fixed coordinate system O. Transform to an arbitrary but parallel body-fixed coordinate system Q, that
is, the coordinate axes have the same orientation as the center of mass coordinate system. The location of the
mass $\alpha$ with respect to this arbitrary coordinate system is $\mathbf{R} = (X_1, X_2, X_3)$. That is, the general vectors for
the two coordinate systems are related by

$$\mathbf{R} = \mathbf{a} + \mathbf{r} \tag{11.34}$$

where $\mathbf{a}$ is the vector connecting the origins of the coordinate systems O and Q illustrated in figure 11.2.
The elements of the inertia tensor with respect to axis system Q are given by equation 11.12 to be


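$$J_{ij} \equiv \sum_\alpha^N m_\alpha \left( \delta_{ij} \sum_k^3 X_{\alpha,k}^2 - X_{\alpha,i} X_{\alpha,j} \right) \tag{11.35}$$

The components along the three axes for each of the two coordinate systems are related by

$$X_i = a_i + x_i \tag{11.36}$$

Substituting these into the above inertia tensor relation gives

$$\begin{aligned}
J_{ij} &= \sum_\alpha^N m_\alpha \left( \delta_{ij} \sum_k^3 (x_{\alpha,k} + a_k)^2 - (x_{\alpha,i} + a_i)(x_{\alpha,j} + a_j) \right) \\
       &= \sum_\alpha^N m_\alpha \left( \delta_{ij} \sum_k^3 x_{\alpha,k}^2 - x_{\alpha,i} x_{\alpha,j} \right)
        + \sum_\alpha^N m_\alpha \left( \delta_{ij} \sum_k^3 \left( 2 x_{\alpha,k} a_k + a_k^2 \right) - \left( a_i x_{\alpha,j} + a_j x_{\alpha,i} + a_i a_j \right) \right)
\end{aligned} \tag{11.37}$$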


The  first  summation  on  the  right-hand  side  corresponds  to  the  elements  Iij  of  the  inertia  tensor  in  the 
center-of-mass  frame.  Thus  the  terms  can  be  regrouped  to  give 


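$$J_{ij} = I_{ij} + \sum_\alpha^N m_\alpha \left( \delta_{ij} \sum_k^3 a_k^2 - a_i a_j \right)
        + \sum_\alpha^N m_\alpha \left( 2 \delta_{ij} \sum_k^3 x_{\alpha,k} a_k - a_i x_{\alpha,j} - a_j x_{\alpha,i} \right) \tag{11.38}$$

However, each term in the last bracket involves a sum of the form $\sum_\alpha m_\alpha x_{\alpha,k}$. Take the coordinate system
O to be with respect to the center of mass for which

$$\sum_\alpha^N m_\alpha \mathbf{r}'_\alpha = 0 \tag{11.39}$$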
This also applies to each component k, that is

$$\sum_\alpha^N m_\alpha x_{\alpha,k} = 0 \tag{11.40}$$

Therefore all of the terms in the last bracket cancel leaving

$$J_{ij} = I_{ij} + \sum_\alpha^N m_\alpha \left( \delta_{ij} \sum_k^3 a_k^2 - a_i a_j \right) \tag{11.41}$$

But $\sum_\alpha m_\alpha = M$ and $\sum_k a_k^2 = a^2$, thus

$$J_{ij} = I_{ij} + M \left( a^2 \delta_{ij} - a_i a_j \right) \tag{11.42}$$


where $I_{ij}$ is the center-of-mass inertia tensor. This is the general form of Steiner's parallel-axis theorem.
As an example, the moment of inertia around the $X_1$ axis is given by

$$J_{11} = I_{11} + M \left( (a_1^2 + a_2^2 + a_3^2)\,\delta_{11} - a_1^2 \right) = I_{11} + M (a_2^2 + a_3^2) \tag{11.43}$$

which corresponds to the elementary statement that the difference in the moments of inertia equals the
mass of the body multiplied by the square of the distance between the parallel axes, $x_1$ and $X_1$. Note that, for
a given axis direction, the moment of inertia is a minimum, $I_{ii}$, when the axis passes through the center of mass.
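
The theorem (11.42) can be checked numerically. The sketch below, in Python with NumPy assumed available,
computes the inertia tensor of a few made-up point masses directly about an origin Q, and again by evaluating
the center-of-mass tensor and shifting it with (11.42). The function names and the particle data are
illustrative assumptions, not part of the text.

```python
import numpy as np

def inertia_tensor(masses, positions):
    """Inertia tensor of point masses about the origin, equation (11.12)."""
    m = np.asarray(masses, dtype=float)
    r = np.asarray(positions, dtype=float)
    return np.sum(m * np.sum(r * r, axis=1)) * np.eye(3) - np.einsum("a,ai,aj->ij", m, r, r)

def parallel_axis(I_cm, M, a):
    """Shift a center-of-mass inertia tensor by the offset a, equation (11.42)."""
    a = np.asarray(a, dtype=float)
    return I_cm + M * (np.dot(a, a) * np.eye(3) - np.outer(a, a))

# Made-up particle data: positions R are measured from the origin Q
masses = np.array([1.0, 2.0, 3.0])
R = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 1.0], [1.0, 2.0, 0.0]])

M = masses.sum()
a = (masses[:, None] * R).sum(axis=0) / M     # center of mass O relative to Q
r = R - a                                     # positions relative to O, so R = a + r

J_direct = inertia_tensor(masses, R)          # tensor about Q computed directly
I_cm = inertia_tensor(masses, r)              # tensor about the center of mass
J_shifted = parallel_axis(I_cm, M, a)         # tensor about Q via equation (11.42)

print(np.round(J_direct - J_shifted, 12))
```

Because (11.42) is quadratic in the components of $\mathbf{a}$, reversing the sign of the offset vector leaves the result unchanged.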


11.1  Example:  Inertia  tensor  of  a solid  cube  rotating  about  the  center  of  mass. 


The complicated expressions for the inertia tensor can be understood using the example of a uniform solid
cube with side b, density $\rho$, and mass $M = \rho b^3$, rotating about different axes. Assume that the origin of the
coordinate system O is at the center of mass with the axes perpendicular to the centers of the faces of
the cube.

Figure: Inertia tensor of a uniform solid cube of side b about the center of mass O and a corner of the
cube Q. The vector $\mathbf{a}$ is the vector distance between O and Q.

The components of the inertia tensor can be calculated using (11.13) written as an integral over the mass
distribution rather than a summation. Thus

$$I_{11} = \rho \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} (x_2^2 + x_3^2)\, dx_3\, dx_2\, dx_1
        = \frac{1}{6} \rho b^5 = \frac{1}{6} M b^2 = I_{22} = I_{33}$$

By symmetry the diagonal moments of inertia about each face are identical. Similarly the products of inertia
are given by

$$I_{12} = -\rho \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} (x_1 x_2)\, dx_3\, dx_2\, dx_1 = 0$$

Thus the inertia tensor is given by

$$\mathbf{I}^{cm} = \frac{1}{6} M b^2 \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Note  that  this  inertia  tensor  is  diagonal  implying  that  this  is  the  principal  axis  system.  In  this  case  all  three 
principal  moments  of  inertia  are  identical  and  perpendicular  to  the  centers  of  the  faces  of  the  cube.  This  is 
as  expected  from  the  symmetry  of  the  cubic  geometry. 
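
The integrals above can be approximated numerically as a cross-check. The following Python sketch, assuming
NumPy is available, divides the cube into small cells and applies a discrete form of (11.13). The side length,
density, and grid resolution are arbitrary illustrative values, and the midpoint rule used here is only
approximate, so the result approaches the analytic value as the grid is refined.

```python
import numpy as np

b, rho, n = 2.0, 1.5, 40           # side, density, grid cells per side (all illustrative)
M = rho * b**3

# Midpoints of n equal cells spanning [-b/2, b/2] along each axis
x = (np.arange(n) + 0.5) * (b / n) - b / 2
X1, X2, X3 = np.meshgrid(x, x, x, indexing="ij")
r = np.stack([X1.ravel(), X2.ravel(), X3.ravel()], axis=1)   # cell centers, shape (n^3, 3)
dm = rho * (b / n) ** 3                                       # mass of each cell

# Discrete version of (11.13): sum over cells of dm (delta_ij r^2 - x_i x_j)
I_cm = dm * (np.sum(r * r) * np.eye(3) - r.T @ r)

# Express the result in units of M b^2 for comparison with the analytic value 1/6
print(np.round(I_cm / (M * b**2), 4))
```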
11.2 Example: Inertia tensor about a corner of a solid cube.


a) Direct calculation Let one corner of the cube be the origin of the coordinate system Q and assume
that the three adjacent sides of the cube lie along the coordinate axes. The components of the inertia tensor
can be calculated using (11.13). Thus

$$I_{11} = \rho \int_0^b \int_0^b \int_0^b (x_2^2 + x_3^2)\, dx_3\, dx_2\, dx_1 = \frac{2}{3} \rho b^5 = \frac{2}{3} M b^2$$

$$I_{12} = -\rho \int_0^b \int_0^b \int_0^b (x_1 x_2)\, dx_3\, dx_2\, dx_1 = -\frac{1}{4} \rho b^5 = -\frac{1}{4} M b^2$$

Thus, evaluating all the nine components gives

$$\mathbf{I}^{corner} = \frac{1}{12} M b^2 \begin{pmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{pmatrix}$$


b) Parallel-axis theorem This inertia tensor also can be calculated using the parallel-axis theorem to
relate the moment of inertia about the corner to that at the center of mass. As shown in the figure, the
vector $\mathbf{a}$ has components

$$a_1 = a_2 = a_3 = \frac{b}{2}$$

Applying the parallel-axis theorem gives

$$J_{11} = I_{11} + M (a^2 - a_1^2) = I_{11} + M (a_2^2 + a_3^2) = \frac{1}{6} M b^2 + \frac{1}{2} M b^2 = \frac{2}{3} M b^2$$

and similarly for $J_{22}$ and $J_{33}$. The off-diagonal terms are given by

$$J_{12} = I_{12} + M (-a_1 a_2) = -\frac{1}{4} M b^2$$

Thus the inertia tensor, transformed from the center of mass to the corner of the cube, is

$$\mathbf{I}^{corner} = \begin{pmatrix}
\frac{2}{3} M b^2 & -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 \\
-\frac{1}{4} M b^2 & \frac{2}{3} M b^2 & -\frac{1}{4} M b^2 \\
-\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 & \frac{2}{3} M b^2
\end{pmatrix}
= \frac{1}{12} M b^2 \begin{pmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{pmatrix}$$

This inertia tensor about the corner of the cube is the same as that obtained by direct integration.
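
The arithmetic of part b) can be reproduced exactly with rational numbers. The short Python sketch below uses
only the standard library and works in units where $M = 1$ and $b = 1$, which is an assumption made purely to
keep every entry an exact fraction of $Mb^2$.

```python
from fractions import Fraction as F

# Center-of-mass tensor of the cube in units of M b^2: (1/6) times the unit matrix
I_cm = [[F(1, 6) if i == j else F(0) for j in range(3)] for i in range(3)]

a = [F(1, 2)] * 3                      # offset between O and Q in units of b
a2 = sum(ak * ak for ak in a)          # a^2 = 3/4

# Equation (11.42): J_ij = I_ij + M (a^2 delta_ij - a_i a_j), with M = 1 in these units
J = [[I_cm[i][j] + (a2 * (1 if i == j else 0) - a[i] * a[j]) for j in range(3)]
     for i in range(3)]

for row in J:
    print([str(x) for x in row])       # entries in units of M b^2
```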


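c) Principal moments of inertia The coordinate axis frame used for rotation about the corner of the
cube is not a principal axis frame. Therefore let us diagonalize the inertia tensor to find the principal
axis frame and the principal moments of inertia about a corner. To achieve this requires solving the secular
determinant

$$\begin{vmatrix}
\left(\frac{2}{3} M b^2 - I\right) & -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 \\
-\frac{1}{4} M b^2 & \left(\frac{2}{3} M b^2 - I\right) & -\frac{1}{4} M b^2 \\
-\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 & \left(\frac{2}{3} M b^2 - I\right)
\end{vmatrix} = 0$$

The value of a determinant is not affected by adding or subtracting any row or column from any other
row or column. Subtracting row 1 from row 2 gives

$$\begin{vmatrix}
\left(\frac{2}{3} M b^2 - I\right) & -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 \\
-\frac{11}{12} M b^2 + I & \left(\frac{11}{12} M b^2 - I\right) & 0 \\
-\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 & \left(\frac{2}{3} M b^2 - I\right)
\end{vmatrix} = 0$$

The determinant of this matrix is straightforward to evaluate and equals

$$\left(\frac{1}{6} M b^2 - I\right) \left(\frac{11}{12} M b^2 - I\right) \left(\frac{11}{12} M b^2 - I\right) = 0$$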
Thus the roots are

$$\mathbf{I}^{corner} = \begin{pmatrix} \frac{1}{6} M b^2 & 0 & 0 \\ 0 & \frac{11}{12} M b^2 & 0 \\ 0 & 0 & \frac{11}{12} M b^2 \end{pmatrix}$$

The identical roots $I_{22} = I_{33} = \frac{11}{12} M b^2$ imply that the principal axis associated with $I_{11}$ must be a symmetry
axis. The orientation can be found by substituting $I_{11}$ into the above equation

$$\left( \{\mathbf{I}\} - I \{\mathbb{1}\} \right) \cdot \boldsymbol{\omega}
 = \frac{1}{4} M b^2 \begin{pmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{pmatrix}
   \begin{pmatrix} \omega_{11} \\ \omega_{21} \\ \omega_{31} \end{pmatrix} = 0$$

where the second subscript 1 attached to $\omega_{i1}$ signifies that this solution corresponds to $I_{11}$. This gives

$$\begin{aligned}
2\omega_{11} - \omega_{21} - \omega_{31} &= 0 \\
-\omega_{11} + 2\omega_{21} - \omega_{31} &= 0 \\
-\omega_{11} - \omega_{21} + 2\omega_{31} &= 0
\end{aligned}$$

Solving these three equations gives the unit vector for the first principal axis, for which $I_{11} = \frac{1}{6} M b^2$, to be

$$\hat{\mathbf{e}}_1 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$$

This can be repeated to find the other two principal axes by substituting $I_2 = \frac{11}{12} M b^2$. This
gives for the second principal moment $I_{22}$

$$\left( \{\mathbf{I}\} - I \{\mathbb{1}\} \right) \cdot \boldsymbol{\omega}
 = \frac{1}{12} M b^2 \begin{pmatrix} -3 & -3 & -3 \\ -3 & -3 & -3 \\ -3 & -3 & -3 \end{pmatrix}
   \begin{pmatrix} \omega_{12} \\ \omega_{22} \\ \omega_{32} \end{pmatrix} = 0$$

This results in three identical equations for the components of $\boldsymbol{\omega}$, namely

$$\omega_{12} + \omega_{22} + \omega_{32} = 0$$

This does not uniquely determine the direction of $\boldsymbol{\omega}$. However, it does imply that $\boldsymbol{\omega}_2$ corresponding to the
second principal axis has the property that

$$\boldsymbol{\omega}_2 \cdot \hat{\mathbf{e}}_1 = 0$$

that is, any direction of $\hat{\mathbf{e}}_2$ that is perpendicular to $\hat{\mathbf{e}}_1$ is acceptable. In other words, any two orthogonal unit
vectors $\hat{\mathbf{e}}_2$ and $\hat{\mathbf{e}}_3$ that are perpendicular to $\hat{\mathbf{e}}_1$ are acceptable. This ambiguity exists whenever two eigenvalues
are equal; the three principal axes are only uniquely defined if all three eigenvalues are different. The same
ambiguity exists when all three eigenvalues are identical, as occurs for the principal moments of inertia about
the center of mass of a uniform solid cube. This explains why the principal moment of inertia for the diagonal
of the cube, that passes through the center of mass, has the same moment as when the principal axes pass
through the center of the faces of the cube.
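
The hand diagonalization in part c) can be cross-checked numerically. The following Python sketch assumes
NumPy is available and works in units of $Mb^2$. Because two eigenvalues are degenerate, the two eigenvectors
returned for that pair are just one acceptable choice within the plane perpendicular to the body diagonal.

```python
import numpy as np

# Corner inertia tensor of the uniform cube in units of M b^2
I_corner = np.array([[ 8.0, -3.0, -3.0],
                     [-3.0,  8.0, -3.0],
                     [-3.0, -3.0,  8.0]]) / 12.0

moments, axes = np.linalg.eigh(I_corner)      # ascending eigenvalues, orthonormal columns

e1 = axes[:, 0]                                # axis belonging to the smallest moment
diagonal = np.ones(3) / np.sqrt(3.0)           # body diagonal (1, 1, 1)/sqrt(3)

print(np.round(moments * 12, 6))               # principal moments in units of M b^2 / 12
print(abs(e1 @ diagonal))                      # equals 1 if e1 lies along the diagonal (sign is arbitrary)
print(np.round(axes[:, 1:].T @ diagonal, 12))  # degenerate axes are perpendicular to the diagonal
```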


11.9  Perpendicular-axis  theorem  for  plane  laminae 

Rigid-body rotation of thin plane laminae is encountered frequently. Examples of such laminar
bodies are a plane sheet of metal, a thin door, a bicycle wheel, a thin envelope or book. Deriving the inertia
tensor for a plane lamina is relatively simple because there are limits on the possible relative magnitude
of the principal moments of inertia. Consider that the principal axes are along the x, y, z coordinate axes.
Then the sum of two principal moments of inertia about the center of mass is

$$\begin{aligned}
I_x + I_y &= \int \rho (y^2 + z^2)\, dV + \int \rho (x^2 + z^2)\, dV \\
          &= \int \rho (x^2 + y^2)\, dV + 2 \int \rho z^2\, dV \geq \int \rho (x^2 + y^2)\, dV = I_z
\end{aligned} \tag{11.44}$$
Note that for any body the three principal moments of inertia must satisfy the triangle rule that the sum of
any pair must exceed or equal the third. Moreover, if the body is a thin lamina with thickness z = 0, that
is, a thin plate in the x-y plane, then

$$I_x + I_y = I_z \tag{11.45}$$

This perpendicular-axis theorem can be very useful for solving problems involving rotation of plane laminae.

The opposite of a plane lamina is a long thin cylindrical needle of mass m, length L, and radius r.
Along the symmetry axis the principal moment is $I_z = \frac{1}{2} m r^2 \rightarrow 0$ as $r \rightarrow 0$, while perpendicular to the
symmetry axis $I_x = I_y = \frac{1}{12} m L^2$. These satisfy the triangle rule.

11.3  Example:  Inertia  tensor  of  a hula  hoop 

The hula hoop is a thin plane circular ring of radius R and mass M. Assume that the symmetry axis of
the circular ring is the 3 axis.

a) The principal moments of inertia about the center of mass: The principal moment of inertia along the
3 axis is $I_{33} = MR^2$. Then equation 11.45 plus symmetry tells us that the two principal moments of inertia
in the plane of the hula hoop must be $I_{11} = I_{22} = \frac{1}{2} MR^2$.

b) The principal moments of inertia about the periphery of the ring: Using the parallel-axis theorem
tells us that the moment perpendicular to the plane of the hula hoop is $I_{33} = 2MR^2$. In the plane of the hoop
the moment tangential to the hoop is $I_{11} = \frac{3}{2} MR^2$ and the moment radial to the hoop is $I_{22} = \frac{1}{2} MR^2$. The
hula dancer often swings the hoop about the periphery and perpendicular to the plane by swinging their hips.
Another movement is jumping through the hoop by rotating the hoop tangential to the periphery. Calculation
of such maneuvers requires knowledge of these principal moments of inertia.
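
The values in part b) follow from the parallel-axis theorem (11.42) applied to the center-of-mass moments of
part a). As a worked sketch in LaTeX notation, assume the origin is moved to a point on the ring by the offset
$\mathbf{a} = (0, R, 0)$, so that axis 2 is radial and axis 1 is tangential at that point; this labeling is an assumption
chosen to match the values quoted above.

```latex
\begin{aligned}
J_{11} &= I_{11} + M\,(a^2 - a_1^2) = \tfrac{1}{2} M R^2 + M R^2 = \tfrac{3}{2} M R^2
  && \text{(tangential axis)} \\
J_{22} &= I_{22} + M\,(a^2 - a_2^2) = \tfrac{1}{2} M R^2 + 0 = \tfrac{1}{2} M R^2
  && \text{(radial axis)} \\
J_{33} &= I_{33} + M\,(a^2 - a_3^2) = M R^2 + M R^2 = 2 M R^2
  && \text{(axis perpendicular to the plane)} \\
J_{ij} &= -M\,a_i a_j = 0 \quad (i \neq j)
  && \text{(only } a_2 \text{ is non-zero)}
\end{aligned}
```

The vanishing products of inertia confirm that these three axes remain principal axes at the periphery, and
the shifted moments still satisfy $J_{11} + J_{22} = J_{33}$.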

11.4  Example:  Inertia  tensor  of  a thin  book 

Consider a thin rectangular book of mass M, width a and length b with thickness $t \ll a$ and $t \ll b$.
About the center of mass the moment of inertia about the axis perpendicular to the plane of the book is
$I_{33} = \frac{M}{12}(a^2 + b^2)$. The other two moments are $I_{11} = \frac{M}{12} a^2$ and $I_{22} = \frac{M}{12} b^2$ which satisfy equation 11.45.
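
The perpendicular-axis theorem for this example can be checked with a simple numerical sum over a grid of
cells covering the plate. The Python sketch below assumes NumPy is available; the mass, dimensions, and grid
resolution are arbitrary illustrative values, and the assignment of the side of length a to axis 2 is an
assumption chosen so that $I_{11}$ matches the expression above.

```python
import numpy as np

M, a, b, n = 1.2, 0.15, 0.23, 200     # illustrative mass, side lengths, and grid resolution

# Midpoints of an n x n grid covering the plate, which lies in the 1-2 plane (x3 = 0).
# Assumption: the side of length a is measured along axis 2, so that I11 = M a^2 / 12.
x1 = (np.arange(n) + 0.5) * (b / n) - b / 2
x2 = (np.arange(n) + 0.5) * (a / n) - a / 2
X1, X2 = np.meshgrid(x1, x2, indexing="ij")
dm = M / n**2                          # each cell carries an equal share of the mass

I11 = dm * np.sum(X2**2)               # distance from axis 1 is |x2| when x3 = 0
I22 = dm * np.sum(X1**2)               # distance from axis 2 is |x1| when x3 = 0
I33 = dm * np.sum(X1**2 + X2**2)       # distance from axis 3 is sqrt(x1^2 + x2^2)

print(I11, I22, I33, I11 + I22 - I33)
```

The sum $I_{11} + I_{22}$ equals $I_{33}$ for any planar mass distribution, while the comparison with $\frac{M}{12} a^2$ and $\frac{M}{12} b^2$
is only as accurate as the grid allows.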

11.10  General  properties  of  the  inertia  tensor 

11.10.1  Inertial  equivalence 

The  elements  of  the  inertia  tensor,  the  values  of  the  principal  moments  of  inertia,  and  the  orientation  of  the 
principal  axes  for  a rigid  body,  all  depend  on  the  choice  of  origin  for  the  system.  Recall  that  for  the  kinetic 
energy  to  be  separable  into  translational  and  rotational  portions,  the  origin  of  the  body  coordinate  system 
must  coincide  with  the  center  of  mass  of  the  body.  However,  for  any  choice  of  the  origin  of  any  body,  there 
always  exists  an  orientation  of  the  axes  that  diagonalizes  the  inertia  tensor. 

The inertial properties of a body for rotation about a specific body-fixed location are defined completely
by only three principal moments of inertia irrespective of the detailed shape of the body. As a result, the
inertial properties of any body about a body-fixed point are equivalent to those of an ellipsoid that has the
same three principal moments of inertia. The symmetry properties of this equivalent ellipsoidal body define
the symmetry of the inertial properties of the body. If a body has some simple symmetry then usually it is
obvious what the principal axes of the body will be.

Spherical top: $I_1 = I_2 = I_3$

A spherical  top  is  a body  having  three  degenerate  principal  moments  of  inertia.  Such  a body  has  the  same 
symmetry  as  the  inertia  tensor  about  the  center  of  a uniform  sphere.  For  a sphere  it  is  obvious  from  the 
symmetry  that  any  orientation  of  three  mutually  orthogonal  axes  about  the  center  of  the  uniform  sphere  are 
equally  good  principal  axes.  For  a uniform  cube  the  principal  axes  of  the  inertia  tensor  about  the  center  of 
mass  were  shown  to  be  aligned  such  that  they  pass  through  the  center  of  each  face,  and  the  three  principal 
moments  are  identical;  that  is,  inertially  it  is  equivalent  to  a spherical  top.  A less  obvious  consequence  of  the 
spherical  symmetry  is  that  any  orientation  of  three  mutually  perpendicular  axes  about  the  center  of  mass  of 
a uniform  cube  is  an  equally  good  principal  axis  system. 
Symmetric top: $I_1 = I_2 \neq I_3$

The equivalent ellipsoid for a body with two degenerate principal moments of inertia is a spheroid which has
cylindrical symmetry with the cylindrical axis aligned along the third axis. A body with $I_3 < I_1 = I_2$ is a
prolate spheroid while a body with $I_3 > I_1 = I_2$ is an oblate spheroid. Examples with a prolate spheroidal
equivalent inertial shape are a rugby ball, pencil, or a baseball bat. Examples of an oblate spheroid are an
orange, or a frisbee. A uniform sphere, or a uniform cube, rotating about a point displaced from the center
of mass also behaves inertially like a symmetric top. The cylindrical symmetry of the equivalent spheroid
makes it obvious that any mutually perpendicular axes that are normal to the axis of cylindrical symmetry
are equally good principal axes even when the cross section in the 1-2 plane is square as opposed to circular.

A rotor is a diatomic-molecule shaped body which is a special case of a symmetric top where $I_1 = 0$
and $I_2 = I_3$. The rotation of a rotor is perpendicular to the symmetry axis since the rotational energy and
angular momentum about the symmetry axis are zero because the principal moment of inertia about the
symmetry axis is zero.

Asymmetric top: $I_1 \neq I_2 \neq I_3$

A body where all three principal moments of inertia are distinct, $I_1 \neq I_2 \neq I_3$, is called an asymmetric
top. Some molecules and nuclei have asymmetric, triaxially-deformed shapes.
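
The classification above can be sketched as a small helper. This is an illustrative sketch only: the function name `classify_top`, the tolerance arguments, and the returned labels are assumptions made for this example and do not appear in the text.

```python
import math


def classify_top(i1: float, i2: float, i3: float, rel_tol: float = 1e-9) -> str:
    """Classify a rigid body from its three principal moments of inertia."""
    low, mid, high = sorted((i1, i2, i3))

    def same(a: float, b: float) -> bool:
        # Treat two moments as degenerate when they agree within tolerance.
        return math.isclose(a, b, rel_tol=rel_tol, abs_tol=1e-12)

    if same(low, mid) and same(mid, high):
        return "spherical top"
    if same(mid, high):
        # The distinct moment is the smallest one: prolate, or a rotor if it vanishes.
        if same(low, 0.0):
            return "rotor"
        return "symmetric top (prolate)"
    if same(low, mid):
        # The distinct moment is the largest one: oblate.
        return "symmetric top (oblate)"
    return "asymmetric top"


if __name__ == "__main__":
    for moments in [(1, 1, 1), (2, 2, 1), (1, 1, 2), (0, 3, 3), (1, 2, 3)]:
        print(moments, "->", classify_top(*moments))
```

The moments are sorted first so that the result does not depend on which axis is labelled 1, 2, or 3.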

11.10.2  Orthogonality  of  principal  axes 

The body-fixed principal axes comprise an orthogonal set, for which the vectors $\mathbf{L}$ and $\boldsymbol{\omega}$ are simply related.
Components of $\mathbf{L}$ and $\boldsymbol{\omega}$ can be taken along the three body-fixed axes denoted by $i$. Thus for the $m$th
principal moment $I_m$

$$L_{im} = I_m \omega_{im} \tag{11.46}$$

Written in terms of the inertia tensor

$$L_{im} = \sum_k I_{ik}\omega_{km} = I_m \omega_{im} \tag{11.47}$$

Similarly the $n$th principal moment can be written as

$$L_{kn} = \sum_i I_{ki}\omega_{in} = I_n \omega_{kn} \tag{11.48}$$

Multiplying equation 11.47 by $\omega_{in}$ and summing over $i$ gives

$$\sum_{i,k} I_{ik}\omega_{km}\omega_{in} = \sum_i I_m \omega_{im}\omega_{in} \tag{11.49}$$

Similarly multiplying equation 11.48 by $\omega_{km}$ and summing over $k$ gives

$$\sum_{i,k} I_{ki}\omega_{km}\omega_{in} = \sum_k I_n \omega_{km}\omega_{kn} \tag{11.50}$$

The left-hand sides of these equations are identical since the inertia tensor is symmetric, that is $I_{ik} = I_{ki}$.
Therefore subtracting these equations gives

$$\sum_i I_m \omega_{im}\omega_{in} - \sum_k I_n \omega_{km}\omega_{kn} = 0 \tag{11.51}$$

That is

$$(I_m - I_n)\sum_k \omega_{km}\omega_{kn} = 0 \tag{11.52}$$

or

$$(I_m - I_n)\,\boldsymbol{\omega}_m \cdot \boldsymbol{\omega}_n = 0 \tag{11.53}$$

If $I_m \neq I_n$ then

$$\boldsymbol{\omega}_m \cdot \boldsymbol{\omega}_n = 0 \tag{11.54}$$

which implies that the $m$ and $n$ principal axes are perpendicular. However, if $I_m = I_n$ then equation
11.53 does not require that $\boldsymbol{\omega}_m \cdot \boldsymbol{\omega}_n = 0$, that is, these axes are not necessarily perpendicular, but, with
no loss of generality, these two axes can be chosen to be perpendicular with any orientation in the plane
perpendicular to the symmetry axis.

Summarizing  the  above  discussion,  the  inertia  tensor  has  the  following  properties. 

1)  Diagonalization  may  be  accomplished  by  an  appropriate  rotation  of  the  axes  in  the  body. 

2) The principal moments (eigenvalues) are obtained as roots of the secular determinant and are real; the
principal axes (eigenvectors) follow by substituting each root back into the eigenvalue equation.

3)  The  principal  axes  (eigenvectors)  are  real  and  orthogonal. 

4) For a symmetric top with two identical principal moments of inertia, any pair of orthogonal
axes perpendicular to the symmetry axis is a satisfactory pair of eigenvectors.

5) For a spherical top with three identical principal moments of inertia, the principal axis system can
have any orientation with respect to the origin.
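
Properties 3 and 4 can be checked numerically on the inertia tensor of a uniform cube about one corner, which in units of $Mb^2/12$ has diagonal elements 8 and off-diagonal elements -3. The following is a minimal sketch; the helper name `principal_moment` and the particular trial axes are assumptions chosen for illustration.

```python
def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def matvec(m, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return [dot(row, v) for row in m]


# Inertia tensor of a uniform cube about a corner, in units of M b^2 / 12.
CORNER_TENSOR = [[8, -3, -3], [-3, 8, -3], [-3, -3, 8]]


def principal_moment(tensor, axis, tol=1e-12):
    """Return I if tensor . axis = I * axis, otherwise None (axis is not principal)."""
    image = matvec(tensor, axis)
    moment = dot(image, axis) / dot(axis, axis)
    if all(abs(image[i] - moment * axis[i]) <= tol for i in range(3)):
        return moment
    return None


diagonal = [1, 1, 1]   # body diagonal through the corner
perp_a = [-1, 1, 0]    # perpendicular to the diagonal
perp_b = [0, -1, 1]    # also perpendicular to the diagonal, but not to perp_a
perp_c = [-1, -1, 2]   # perpendicular to both the diagonal and perp_a

if __name__ == "__main__":
    for name, axis in [("diagonal", diagonal), ("perp_a", perp_a),
                       ("perp_b", perp_b), ("perp_c", perp_c), ("x", [1, 0, 0])]:
        print(name, principal_moment(CORNER_TENSOR, axis))
    # Distinct moments force orthogonality; degenerate ones do not.
    print("diagonal . perp_a =", dot(diagonal, perp_a))
    print("perp_a . perp_b   =", dot(perp_a, perp_b))
    print("perp_a . perp_c   =", dot(perp_a, perp_c))
```

The axes `perp_a` and `perp_b` share the same principal moment without being orthogonal, while `perp_a` and `perp_c` show that an orthogonal pair can always be chosen within the degenerate plane.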

11.11 Angular momentum $\mathbf{L}$ and angular velocity $\boldsymbol{\omega}$ vectors

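The angular momentum is a primary observable for rotation. As discussed in chapter 11.5, the angular
momentum $\mathbf{L}$ is compactly and elegantly written in matrix form using the tensor algebra relation

$$\mathbf{L} = \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \cdot \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} \tag{11.55}$$

where $\boldsymbol{\omega}$ is the angular velocity, $\{\mathbf{I}\}$ the inertia tensor, and $\mathbf{L}$ the corresponding angular momentum.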

Two important consequences of equation 11.55 are that:

• The angular momentum $\mathbf{L}$ and angular velocity $\boldsymbol{\omega}$ are not necessarily colinear.

• In general the principal axis system of the rotating rigid body is not aligned with either the angular
momentum or angular velocity vectors.

An exception to these statements occurs when the angular velocity $\boldsymbol{\omega}$ is aligned along a principal axis
for which the inertia tensor is diagonal, i.e. $I_{ij} = I_i \delta_{ij}$, and then both $\mathbf{L}$ and $\boldsymbol{\omega}$ point along this principal
axis. In general the angular momentum $\mathbf{L}$ and angular velocity $\boldsymbol{\omega}$ precess around each other. An important
special case is for torque-free systems where Noether's theorem implies that the angular momentum vector
$\mathbf{L}$ is conserved both in magnitude and direction. In this case, the angular velocity $\boldsymbol{\omega}$, and the principal axis
system, both precess around the angular momentum vector $\mathbf{L}$. That is, the body appears to tumble with
respect to the laboratory fixed frame. Understanding rigid-body rotation requires care not to confuse the
body-fixed principal axis coordinate frame, used to determine the inertia tensor, and the fixed laboratory
frame where the motion is observed.

11.5  Example:  Rotation  about  the  center  of  mass  of  a solid  cube 

It is illustrative to use the inertia tensors of a uniform cube to compute the angular momentum for any
applied angular velocity vector $\boldsymbol{\omega}$ using equation (11.55). If the angular velocity is along the $x$ axis, then
using the inertia tensor for a solid cube, derived earlier, in equation (11.55) gives the angular momentum to
be

$$\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} = \frac{1}{6} M b^2 \omega \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \frac{1}{6} M b^2 \boldsymbol{\omega}$$

This shows that $\mathbf{L}$ and $\boldsymbol{\omega}$ are colinear and thus the $x$ axis is a principal axis. By symmetry, the $y$ and $z$
body-fixed axes also must be principal axes.
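
Consider that the body is rotated about a diagonal of the cube, for which the center of mass will be on
the rotation axis. Then the angular velocity vector is written as $\boldsymbol{\omega} = \omega \frac{1}{\sqrt{3}} (1, 1, 1)^T$, where the components are
$\omega_x = \omega_y = \omega_z = \omega/\sqrt{3}$, with the angular velocity magnitude $\sqrt{\omega_x^2 + \omega_y^2 + \omega_z^2} = \omega$.

$$\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} = \frac{1}{6} M b^2 \omega \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{6} M b^2 \omega \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{6} M b^2 \boldsymbol{\omega}$$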

Note that $\mathbf{L}$ and $\boldsymbol{\omega}$ again are colinear, showing it also is a principal axis. Moreover, the magnitude of $\mathbf{L}$
is identical for orientations of the rotation axis $\boldsymbol{\omega}$ passing through the center of mass when centered on
either one face, or the diagonal, of the cube, implying that the principal moments of inertia about these axes
are identical. This illustrates the important property that, when the three principal moments of inertia are
identical, then any orientation of the coordinate system is an equally good principal axis system. That is,
this corresponds to the spherical top where all orientations are principal axes, not just along the obvious
symmetry axes.


11.6 Example: Rotation about the corner of the cube

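Let us repeat the above exercise for rotation about one corner of the cube. Consider that the angular
velocity is along the $x$ axis. Then the inertia tensor from example 11.2 gives the angular momentum to be

$$\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} = \frac{1}{12} M b^2 \omega \begin{pmatrix} +8 & -3 & -3 \\ -3 & +8 & -3 \\ -3 & -3 & +8 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \frac{1}{12} M b^2 \omega \begin{pmatrix} +8 \\ -3 \\ -3 \end{pmatrix}$$

The angular momentum is far from being aligned with the $x$ axis along which $\boldsymbol{\omega}$ points, that is, it is not a principal axis.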

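Consider that the body is rotated with the angular velocity aligned along the diagonal of the cube that
passes through the corner and the center of mass. Then the angular velocity is written as
$\boldsymbol{\omega} = \omega \frac{1}{\sqrt{3}} (1, 1, 1)^T$, where the components $\omega_x = \omega_y = \omega_z = \omega/\sqrt{3}$ ensure that the magnitude equals $\sqrt{\boldsymbol{\omega} \cdot \boldsymbol{\omega}} = \omega$.

$$\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} = \frac{1}{12} M b^2 \omega \frac{1}{\sqrt{3}} \begin{pmatrix} +8 & -3 & -3 \\ -3 & +8 & -3 \\ -3 & -3 & +8 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{12} M b^2 \omega \frac{1}{\sqrt{3}} \begin{pmatrix} 2 \\ 2 \\ 2 \end{pmatrix} = \frac{1}{6} M b^2 \boldsymbol{\omega}$$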

This is a principal axis since $\mathbf{L}$ and $\boldsymbol{\omega}$ again are colinear, and the angular momentum is the same as for any
axis through the center of mass of a uniform solid cube due to the high symmetry of the cube. If the angular
velocity is perpendicular to the diagonal of the cube, then, for either of these perpendicular axes, the relation
between $\mathbf{L}$ and $\boldsymbol{\omega}$ is given by

$$\mathbf{L} = \frac{1}{12} M b^2 \omega \frac{1}{\sqrt{2}} \begin{pmatrix} +8 & -3 & -3 \\ -3 & +8 & -3 \\ -3 & -3 & +8 \end{pmatrix} \cdot \begin{pmatrix} -1 \\ +1 \\ 0 \end{pmatrix} = \frac{1}{12} M b^2 \omega \frac{1}{\sqrt{2}} \begin{pmatrix} -11 \\ +11 \\ 0 \end{pmatrix} = \frac{11}{12} M b^2 \omega \frac{1}{\sqrt{2}} \begin{pmatrix} -1 \\ +1 \\ 0 \end{pmatrix}$$

Note that this must be a principal axis for rotation about a corner of the cube since $\mathbf{L}$ and $\boldsymbol{\omega}$ are colinear.
The angular momentum is the same for both possible orientations of $\boldsymbol{\omega}$ that are perpendicular to the diagonal
through the center of mass. Diagonalizing the inertia tensor in example 11.2 also gave the above result with
the symmetry axis along the diagonal of the cube.

This example illustrates that it is not necessary to diagonalize the inertia tensor matrix to obtain the
principal axes. The corner of the cube has three mutually perpendicular principal axes independent of the
choice of a body-fixed coordinate frame. The advantage of the principal axis coordinate frame is that the
inertia tensor is diagonal, making evaluation of the angular momentum trivial. That is, there is no physics
associated with the orientation chosen for the body-fixed coordinate frame; this frame only determines the
ratio of the components of the inertia tensor along the chosen coordinates. Note that, if a body has an obvious
symmetry, then intuition is a powerful way to identify the principal axis frame.
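
The two cube examples can be reproduced with a short numerical sketch that evaluates $\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega}$ and the angle between $\mathbf{L}$ and $\boldsymbol{\omega}$. Units with $M = b = \omega = 1$ are assumed, and the helper names (`angular_momentum`, `angle_between_deg`) are illustrative choices rather than anything defined in the text.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def angular_momentum(tensor, omega):
    """L = {I} . omega for a 3x3 inertia tensor and a 3-vector omega."""
    return [dot(row, omega) for row in tensor]


def angle_between_deg(a, b):
    """Angle between two vectors in degrees, clamped against round-off."""
    cos_angle = dot(a, b) / (norm(a) * norm(b))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


# Uniform cube of side b with M = b = 1 (assumed units).
CUBE_ABOUT_CENTER = [[1 / 6, 0, 0], [0, 1 / 6, 0], [0, 0, 1 / 6]]
CUBE_ABOUT_CORNER = [[x / 12 for x in row]
                     for row in ([8, -3, -3], [-3, 8, -3], [-3, -3, 8])]

s2, s3 = math.sqrt(2.0), math.sqrt(3.0)
AXES = {
    "x axis": [1.0, 0.0, 0.0],
    "diagonal": [1 / s3, 1 / s3, 1 / s3],
    "perpendicular to diagonal": [-1 / s2, 1 / s2, 0.0],
}

if __name__ == "__main__":
    for label, tensor in [("center", CUBE_ABOUT_CENTER), ("corner", CUBE_ABOUT_CORNER)]:
        for name, omega in AXES.items():
            L = angular_momentum(tensor, omega)
            print(f"{label:6s} {name:26s} |L| = {norm(L):.4f}  "
                  f"angle(L, omega) = {angle_between_deg(L, omega):.2f} deg")
```

A vanishing angle marks a principal axis; for a unit $\boldsymbol{\omega}$ along such an axis, $|\mathbf{L}|$ is the corresponding principal moment.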

11.12 Kinetic energy of rotating rigid body

Another important observable is the kinetic energy of rotation. Consider a rigid body composed of $N$
particles of mass $m_\alpha$ where $\alpha = 1, 2, 3, \ldots N$. If the body rotates with an instantaneous angular velocity $\boldsymbol{\omega}$
about some fixed point, with respect to the body coordinate system, and this point has an instantaneous
translational velocity $\mathbf{V}$ with respect to the fixed (inertial) coordinate system, see figure 11.1, then the
instantaneous velocity $\mathbf{v}_\alpha$ of the $\alpha$th particle in the fixed frame of reference is given by

$$\mathbf{v}_\alpha = \mathbf{V} + \mathbf{v}''_\alpha + \boldsymbol{\omega} \times \mathbf{r}'_\alpha \tag{11.56}$$

However, for a rigid body, the velocity of a body-fixed point with respect to the body is zero, that is $\mathbf{v}''_\alpha = 0$,
thus

$$\mathbf{v}_\alpha = \mathbf{V} + \boldsymbol{\omega} \times \mathbf{r}'_\alpha \tag{11.57}$$

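The total kinetic energy is given by

$$T = \sum_\alpha^N \frac{1}{2} m_\alpha \mathbf{v}_\alpha \cdot \mathbf{v}_\alpha = \sum_\alpha^N \frac{1}{2} m_\alpha (\mathbf{V} + \boldsymbol{\omega} \times \mathbf{r}'_\alpha) \cdot (\mathbf{V} + \boldsymbol{\omega} \times \mathbf{r}'_\alpha)$$

$$= \frac{1}{2} \sum_\alpha^N m_\alpha V^2 + \sum_\alpha^N m_\alpha \mathbf{V} \cdot \boldsymbol{\omega} \times \mathbf{r}'_\alpha + \frac{1}{2} \sum_\alpha^N m_\alpha (\boldsymbol{\omega} \times \mathbf{r}'_\alpha) \cdot (\boldsymbol{\omega} \times \mathbf{r}'_\alpha) \tag{11.58}$$

This is a general expression for the kinetic energy that is valid for any choice of the origin from which the
body-fixed vectors $\mathbf{r}'_\alpha$ are measured. However, if the origin is chosen to be the center of mass, then, and only
then, the middle term cancels. That is, since $\mathbf{V} \cdot \boldsymbol{\omega} \times$ is independent of the specific particle, then

$$\sum_\alpha^N m_\alpha \mathbf{V} \cdot \boldsymbol{\omega} \times \mathbf{r}'_\alpha = \mathbf{V} \cdot \boldsymbol{\omega} \times \left( \sum_\alpha^N m_\alpha \mathbf{r}'_\alpha \right) \tag{11.59}$$

But the definition of the center of mass is

$$\sum_\alpha^N m_\alpha \mathbf{r}'_\alpha = M \mathbf{R} \tag{11.60}$$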


and  R = 0 in  the  body-fixed  frame  if  the  selected  point  in  the  body  is  the  center  of  mass.  Thus,  when  using 
the  center  of  mass  frame,  the  middle  term  of  equation  11.58  is  zero.  Therefore,  for  the  center  of  mass  frame, 
the  kinetic  energy  separates  into  two  terms  in  the  body-fixed  frame 


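$$T = T_{trans} + T_{rot} \tag{11.61}$$

where

$$T_{trans} = \frac{1}{2} \sum_\alpha^N m_\alpha V^2 = \frac{1}{2} M V^2 \qquad T_{rot} = \frac{1}{2} \sum_\alpha^N m_\alpha (\boldsymbol{\omega} \times \mathbf{r}'_\alpha) \cdot (\boldsymbol{\omega} \times \mathbf{r}'_\alpha) \tag{11.62}$$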


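The vector identity

$$(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{A} \times \mathbf{B}) = A^2 B^2 - (\mathbf{A} \cdot \mathbf{B})^2 \tag{11.63}$$

can be used to simplify $T_{rot}$

$$T_{rot} = \frac{1}{2} \sum_\alpha^N m_\alpha \left[ \omega^2 r'^2_\alpha - (\boldsymbol{\omega} \cdot \mathbf{r}'_\alpha)^2 \right] \tag{11.64}$$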


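The rotational kinetic energy $T_{rot}$ can be expressed in terms of components of $\boldsymbol{\omega}$ and $\mathbf{r}'_\alpha$ in the body-fixed
frame. Also the following formulae are greatly simplified if $\mathbf{r}'_\alpha = (x_\alpha, y_\alpha, z_\alpha)$ in the rotating body-fixed frame
is written in the form $\mathbf{r}'_\alpha = (x_{\alpha,1}, x_{\alpha,2}, x_{\alpha,3})$ where the axes are defined by the numbers 1, 2, 3 rather than
$x, y, z$. In this notation the rotational kinetic energy is written as

$$T_{rot} = \frac{1}{2} \sum_\alpha^N m_\alpha \left[ \left( \sum_i \omega_i^2 \right) \left( \sum_k x_{\alpha,k}^2 \right) - \left( \sum_i \omega_i x_{\alpha,i} \right) \left( \sum_j \omega_j x_{\alpha,j} \right) \right] \tag{11.65}$$

Assume the Kronecker delta relation

$$\omega_i = \sum_j^3 \omega_j \delta_{ij} \tag{11.66}$$

where $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \neq j$.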

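Then the kinetic energy can be written more compactly

$$T_{rot} = \frac{1}{2} \sum_\alpha^N \sum_{i,j}^3 m_\alpha \left[ (\omega_i \omega_j \delta_{ij}) \left( \sum_k^3 x_{\alpha,k}^2 \right) - \omega_i x_{\alpha,i}\, \omega_j x_{\alpha,j} \right]$$

$$= \frac{1}{2} \sum_{i,j}^3 \omega_i \omega_j \left[ \sum_\alpha^N m_\alpha \left( \delta_{ij} \sum_k^3 x_{\alpha,k}^2 - x_{\alpha,i} x_{\alpha,j} \right) \right] \tag{11.67}$$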


The  term  in  the  outer  square  brackets  is  the  inertia  tensor  defined  in  equation  11.12  for  a discrete  body.  The 
inertia  tensor  components  for  a continuous  body  are  given  by  equation  11.13. 

Thus  the  rotational  component  of  the  kinetic  energy  can  be  written  in  terms  of  the  inertia  tensor  as 


$$T_{rot} = \frac{1}{2} \sum_{i,j}^3 I_{ij} \omega_i \omega_j \tag{11.68}$$


Note that when the inertia tensor is diagonal, then the evaluation of the kinetic energy simplifies to

$$T_{rot} = \frac{1}{2} \sum_i^3 I_{ii} \omega_i^2 \tag{11.69}$$


which  is  the  familiar  relation  in  terms  of  the  scalar  moment  of  inertia  I discussed  in  elementary  mechanics. 
Equation  11.68  also  can  be  factored  in  terms  of  the  angular  momentum  L. 


$$T_{rot} = \frac{1}{2} \sum_i \omega_i \sum_j I_{ij} \omega_j = \frac{1}{2} \sum_i \omega_i L_i \tag{11.70}$$


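As mentioned earlier, tensor algebra is an elegant and compact way of expressing such matrix operations.
Thus it is possible to express the rotational kinetic energy as

$$T_{rot} = \frac{1}{2} \begin{pmatrix} \omega_1 & \omega_2 & \omega_3 \end{pmatrix} \cdot \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \cdot \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} \tag{11.71}$$

$$T_{rot} \equiv T = \frac{1}{2} \boldsymbol{\omega} \cdot \{\mathbf{I}\} \cdot \boldsymbol{\omega} \tag{11.72}$$

where the rotational energy $T$ is a scalar. Using equation 11.55 the rotational component of the kinetic
energy also can be written as

$$T_{rot} \equiv T = \frac{1}{2} \boldsymbol{\omega} \cdot \mathbf{L} \tag{11.73}$$

which is the same as given by (11.70). It is interesting to realize that even though $\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega}$ is the inner
product of a tensor and a vector, it is a vector, as illustrated by the fact that the inner product $T_{rot} = \frac{1}{2} \boldsymbol{\omega} \cdot \mathbf{L} = \frac{1}{2} \boldsymbol{\omega} \cdot (\{\mathbf{I}\} \cdot \boldsymbol{\omega})$
is a scalar. Note that the translational kinetic energy $T_{trans}$ must be added to the rotational
kinetic energy $T_{rot}$ to get the total kinetic energy as given by equation 11.61.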
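
The separation of the kinetic energy can be illustrated numerically for a small rigid collection of point masses. The sketch below is only an illustration: the masses, positions, $\mathbf{V}$, and $\boldsymbol{\omega}$ are arbitrary assumed values, and the function names are hypothetical. It compares the particle-by-particle sum with $\frac{1}{2} M V^2 + \frac{1}{2} \boldsymbol{\omega} \cdot \{\mathbf{I}\} \cdot \boldsymbol{\omega}$, first with the origin at the center of mass and then with a displaced origin, where the middle term of equation 11.58 survives.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def center_of_mass(masses, positions):
    total = sum(masses)
    return tuple(sum(m * r[i] for m, r in zip(masses, positions)) / total
                 for i in range(3))


def inertia_tensor(masses, positions):
    """I_ij = sum_a m_a (delta_ij r_a^2 - x_ai x_aj) about the origin of `positions`."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = dot(r, r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                tensor[i][j] += m * (delta * r2 - r[i] * r[j])
    return tensor


def kinetic_energy_direct(masses, positions, v_origin, omega):
    """Sum of (1/2) m |V + omega x r'|^2 over all particles."""
    total = 0.0
    for m, r in zip(masses, positions):
        w_cross_r = cross(omega, r)
        v = tuple(v_origin[i] + w_cross_r[i] for i in range(3))
        total += 0.5 * m * dot(v, v)
    return total


def kinetic_energy_split(masses, positions, v_origin, omega):
    """(1/2) M V^2 + (1/2) omega . {I} . omega, i.e. T_trans + T_rot."""
    tensor = inertia_tensor(masses, positions)
    L = [dot(row, omega) for row in tensor]
    return 0.5 * sum(masses) * dot(v_origin, v_origin) + 0.5 * dot(omega, L)


# Arbitrary illustrative values (assumptions, not from the text).
masses = [1.0, 2.0, 3.0, 4.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
V = (0.3, -0.2, 0.5)
omega = (0.4, 1.1, -0.7)

com = center_of_mass(masses, positions)
about_com = [tuple(r[i] - com[i] for i in range(3)) for r in positions]

if __name__ == "__main__":
    print("origin at center of mass:",
          kinetic_energy_direct(masses, about_com, V, omega),
          kinetic_energy_split(masses, about_com, V, omega))
    print("displaced origin:        ",
          kinetic_energy_direct(masses, positions, V, omega),
          kinetic_energy_split(masses, positions, V, omega))
```

With the displaced origin the two numbers differ by exactly the middle term $\mathbf{V} \cdot \boldsymbol{\omega} \times \sum_\alpha m_\alpha \mathbf{r}'_\alpha$, which is why the split into $T_{trans} + T_{rot}$ is tied to the center of mass.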

11.13 Euler angles


The description of rigid-body rotation is greatly facilitated by transforming from the space-fixed coordinate
frame $(\hat{x}, \hat{y}, \hat{z})$ to a rotating body-fixed coordinate frame $(\hat{1}, \hat{2}, \hat{3})$ for which the inertia tensor is diagonal.
Appendix D introduced the rotation matrix $\{\lambda\}$ which can be used to rotate between the space-fixed coordinate
system, which is stationary, and the instantaneous body-fixed frame which is rotating with respect to the
space-fixed frame. The transformation can be represented by a matrix equation

$$(\hat{1}, \hat{2}, \hat{3}) = \{\lambda\} \cdot (\hat{x}, \hat{y}, \hat{z}) \tag{11.74}$$

where the space-fixed system is identified by unit vectors $(\hat{x}, \hat{y}, \hat{z})$ while $(\hat{1}, \hat{2}, \hat{3})$ defines unit vectors in the
rotated body-fixed system. The rotation matrix $\{\lambda\}$ completely describes the instantaneous relative orientation
of the two systems. Rigid-body rotation requires three independent angular parameters that specify the
orientation of the rigid body such that the corresponding orthogonal transformation matrix is proper, that is,
it has a determinant $|\lambda| = +1$ as given by equation (D.33).

As discussed in Appendix D.2, the 9-component rotation matrix involves only three independent angles.
There are many possible choices for these three angles. It is convenient to use the Euler angles $\phi, \theta, \psi$ (also
called Eulerian angles) shown in figure 11.3 (see footnote 1). The Euler angles are generated by a series of
three rotations that rotate from the space-fixed $(\hat{x}, \hat{y}, \hat{z})$ system to the body-fixed $(\hat{1}, \hat{2}, \hat{3})$ system. The rotation
must be such that the space-fixed $z$ axis rotates by an angle $\theta$ to align with the body-fixed 3 axis. This can be
performed by rotating through an angle $\theta$ about the $\hat{n} \equiv \hat{z} \times \hat{3}$ direction, where $\hat{z}$ and $\hat{3}$ designate the unit
vectors along the "z" axes of the space-fixed and body-fixed frames respectively. The unit vector $\hat{n} \equiv \hat{z} \times \hat{3}$ is
the vector normal to the plane defined by the $\hat{z}$ and $\hat{3}$ unit vectors, and this unit vector $\hat{n} \equiv \hat{z} \times \hat{3}$ is called
the line of nodes. The chosen convention is that the unit vector $\hat{n} \equiv \hat{z} \times \hat{3}$ is along the "x" axis of an
intermediate-axis frame designated by $(\hat{n}, \hat{y}', \hat{z})$, that is, the unit vector $\hat{n}$ is normal to the plane of the $\hat{z}$
and $\hat{3}$ unit vectors, while the unit vectors $\hat{y}'$ and $\hat{z}$ lie in that plane. The sequence of three rotations is
performed as summarized below.

Figure 11.3: The z-x-z sequence of rotations $\lambda_\phi, \lambda_\theta, \lambda_\psi$ corresponding to the Eulerian angles
$(\phi, \theta, \psi)$. The first rotation $\phi$ about the space-fixed $z$ axis (blue) is from the $x$ axis (blue) to the
line of nodes $\hat{n}$ (green). The second rotation $\theta$ about the line of nodes (green) is from the space-fixed
$z$ axis (blue) to the body-fixed 3 axis (red). The third rotation $\psi$ about the body-fixed 3 axis
(red) is from the line of nodes (green) to the body-fixed 1 axis (red).


1) Rotation $\phi$ about the space-fixed $z$ axis from the space $x$ axis to the line of nodes $\hat{n}$: The
first rotation $(\hat{x}, \hat{y}, \hat{z}) \cdot \lambda_\phi \rightarrow (\hat{n}, \hat{y}', \hat{z})$ is in a right-handed direction through an angle $\phi$ about the space-fixed
$z$ axis. Since the rotation takes place in the $x$-$y$ plane, the transformation matrix is

$$\{\lambda_\phi\} = \begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{11.75}$$

Footnote 1: The space-fixed coordinate frame and the body-fixed coordinate frames are unambiguously defined, that is, the space-fixed
frame is stationary while the body-fixed frame is the principal-axis frame of the body. There are several possible intermediate
frames that can be used to define the Euler angles. The z-x-z sequence of rotations, used here, is used in most physics
textbooks in classical mechanics. Unfortunately scientists and engineers use slightly different conventions for defining the Euler
angles. As discussed in Appendix A of "Classical Mechanics" by Goldstein, nuclear and particle physicists have adopted the
z-y-z sequence of rotations while the US and UK aerodynamicists have adopted an x-y-z sequence of rotations.

This leads to the intermediate coordinate system $(\hat{n}, \hat{y}', \hat{z})$ where the rotated $x$ axis now is colinear with the
$\hat{n}$ axis of the intermediate frame, that is, the line of nodes.

$$(\hat{n}, \hat{y}', \hat{z}) = \{\lambda_\phi\} \cdot (\hat{x}, \hat{y}, \hat{z}) \tag{11.76}$$

The precession angular velocity $\dot\phi$ is the rate of change of angle of the line of nodes with respect to the space
$x$ axis about the space-fixed $z$ axis.

2) Rotation $\theta$ about the line of nodes $\hat{n}$ from the space $z$ axis to the body-fixed 3 axis: The
second rotation

$$(\hat{n}, \hat{y}', \hat{z}) \cdot \lambda_\theta \rightarrow (\hat{n}, \hat{y}'', \hat{3}) \tag{11.77}$$

is in a right-handed direction through the angle $\theta$ about the $\hat{n}$ axis (line of nodes) so that the "z" axis becomes
colinear with the body-fixed 3 axis. Because the rotation now is in the $\hat{z}$-$\hat{3}$ plane, the transformation matrix
is

$$\{\lambda_\theta\} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix} \tag{11.78}$$

The line of nodes, which is at the intersection of the space-fixed and body-fixed planes shown in figure 11.3,
points in the $\hat{n} \equiv \hat{z} \times \hat{3}$ direction. The new "z" axis now is the body-fixed 3 axis. The angular velocity $\dot\theta$ is
the rate of change of angle of the body-fixed 3 axis relative to the space-fixed $z$ axis about the line of nodes.


3) Rotation $\psi$ about the body-fixed 3 axis from the line of nodes to the body-fixed 1 axis: The
third rotation

$$(\hat{n}, \hat{y}'', \hat{3}) \cdot \lambda_\psi \rightarrow (\hat{1}, \hat{2}, \hat{3}) \tag{11.79}$$

is in a right-handed direction through the angle $\psi$ about the new body-fixed 3 axis. This third rotation
transforms the rotated intermediate $(\hat{n}, \hat{y}'', \hat{3})$ frame to the final body-fixed coordinate system $(\hat{1}, \hat{2}, \hat{3})$. The
transformation matrix is

$$\{\lambda_\psi\} = \begin{pmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{11.80}$$

The spin angular velocity $\dot\psi$ is the rate of change of the angle of the body-fixed 1 axis with respect to the
line of nodes about the body-fixed 3 axis.

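The total rotation matrix $\{\lambda\}$ is given by

$$\{\lambda\} = \{\lambda_\psi\} \cdot \{\lambda_\theta\} \cdot \{\lambda_\phi\} \tag{11.81}$$

Thus the complete rotation from the space-fixed $(\hat{x}, \hat{y}, \hat{z})$ axis system to the body-fixed $(\hat{1}, \hat{2}, \hat{3})$ axis system
is given by

$$(\hat{1}, \hat{2}, \hat{3}) = \{\lambda\} \cdot (\hat{x}, \hat{y}, \hat{z}) \tag{11.82}$$

where $\{\lambda\}$ is given by the triple product equation (11.81) leading to the rotation matrix

$$\{\lambda\} = \begin{pmatrix} \cos\phi\cos\psi - \sin\phi\cos\theta\sin\psi & \sin\phi\cos\psi + \cos\phi\cos\theta\sin\psi & \sin\theta\sin\psi \\ -\cos\phi\sin\psi - \sin\phi\cos\theta\cos\psi & -\sin\phi\sin\psi + \cos\phi\cos\theta\cos\psi & \sin\theta\cos\psi \\ \sin\phi\sin\theta & -\cos\phi\sin\theta & \cos\theta \end{pmatrix} \tag{11.83}$$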

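The inverse transformation from the body-fixed axis system to the space-fixed axis system is given by

$$(\hat{x}, \hat{y}, \hat{z}) = \{\lambda\}^{-1} \cdot (\hat{1}, \hat{2}, \hat{3}) \tag{11.84}$$

where the inverse matrix $\{\lambda\}^{-1}$ equals the transposed rotation matrix $\{\lambda\}^T$, that is,

$$\{\lambda\}^{-1} = \{\lambda\}^T = \begin{pmatrix} \cos\phi\cos\psi - \sin\phi\cos\theta\sin\psi & -\cos\phi\sin\psi - \sin\phi\cos\theta\cos\psi & \sin\phi\sin\theta \\ \sin\phi\cos\psi + \cos\phi\cos\theta\sin\psi & -\sin\phi\sin\psi + \cos\phi\cos\theta\cos\psi & -\cos\phi\sin\theta \\ \sin\theta\sin\psi & \sin\theta\cos\psi & \cos\theta \end{pmatrix} \tag{11.85}$$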

Taking the product $\{\lambda\} \cdot \{\lambda\}^{-1} = \{\lambda\} \cdot \{\lambda\}^T = \{1\}$, the unit matrix, shows that the rotation matrix is orthogonal, and its determinant of $+1$ shows that it is proper.
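
As a numerical cross-check, the sketch below composes the three elementary rotations in the order of equation 11.81 and compares the result with the closed form of equation 11.83, then tests orthogonality and the determinant. The test angles are arbitrary, and the function names (`rot_z`, `rot_x`, `euler_product`, `euler_closed_form`) are assumptions introduced for this example.

```python
import math


def rot_z(angle):
    """Elementary rotation about the current z (or 3) axis, equations 11.75 and 11.80."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    """Elementary rotation about the line of nodes, equation 11.78."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def euler_product(phi, theta, psi):
    """{lambda} = {lambda_psi} . {lambda_theta} . {lambda_phi}, equation 11.81."""
    return matmul(rot_z(psi), matmul(rot_x(theta), rot_z(phi)))


def euler_closed_form(phi, theta, psi):
    """Closed-form rotation matrix of equation 11.83."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [
        [cf * cp - sf * ct * sp, sf * cp + cf * ct * sp, st * sp],
        [-cf * sp - sf * ct * cp, -sf * sp + cf * ct * cp, st * cp],
        [sf * st, -cf * st, ct],
    ]


def max_abs_difference(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ANGLES = (0.7, 1.1, -0.4)  # arbitrary test angles in radians

if __name__ == "__main__":
    lam = euler_product(*ANGLES)
    print("product vs closed form:", max_abs_difference(lam, euler_closed_form(*ANGLES)))
    print("lambda . lambda^T vs 1:", max_abs_difference(matmul(lam, transpose(lam)), IDENTITY))
    print("determinant:", det3(lam))
```

Both differences should vanish to round-off, and the determinant should equal $+1$ for any choice of angles.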

The use of three different coordinate systems, space-fixed, the intermediate line of nodes, and the
body-fixed frame, can be confusing at first glance. Basically the angle $\phi$ specifies the rotation about the space-fixed
$z$ axis between the space-fixed $x$ axis and the line of nodes of the Euler angle intermediate frame. The angle
$\psi$ specifies the rotation about the body-fixed 3 axis between the line of nodes and the body-fixed 1 axis. Note
that although the space-fixed and body-fixed axes systems each are orthogonal, the Euler angle basis in
general is not orthogonal. For rigid-body rotation the rotation angle $\phi$ about the space-fixed $z$ axis is time
dependent, that is, the line of nodes is rotating with an angular velocity $\dot\phi$ with respect to the space-fixed
coordinate frame. Similarly the body-fixed coordinate frame is rotating about the body-fixed 3 axis with
angular velocity $\dot\psi$ relative to the line of nodes.


11.7 Example: Euler angle transformation


The definition of the Euler angles can be confusing, therefore it is useful to illustrate their use for a
rotational transformation of a primed frame $(x', y', z')$ to an unprimed frame $(x, y, z)$. Assume the first
rotation, about the $z'$ axis, is $\phi = 30°$.

Let the second rotation be $\theta = 45°$ about the line of nodes, that is, the intermediate $x''$ axis. Then, from equation 11.78,

$$\lambda_\theta = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}$$

Let the third rotation be $\psi = 90°$ about the $z$ axis.

Thus the net rotation corresponds to $\lambda = \lambda_\psi \lambda_\theta \lambda_\phi$.
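
The numerical matrices for this example can be generated directly from the elementary rotations of equations 11.75, 11.78, and 11.80. The following sketch assumes those matrix forms and the product order of equation 11.81; the variable and function names are illustrative only.

```python
import math


def rot_z(angle):
    """Elementary rotation about the current z (or 3) axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    """Elementary rotation about the line of nodes (intermediate x axis)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def show(name, matrix):
    print(name)
    for row in matrix:
        print("   " + "  ".join(f"{x:+.4f}" for x in row))


phi, theta, psi = (math.radians(a) for a in (30.0, 45.0, 90.0))
lam_phi, lam_theta, lam_psi = rot_z(phi), rot_x(theta), rot_z(psi)
lam = matmul(lam_psi, matmul(lam_theta, lam_phi))

# Components of the first frame's z unit vector along the final axes.
z_in_final_frame = matvec(lam, [0.0, 0.0, 1.0])

if __name__ == "__main__":
    show("lambda_phi (30 deg)", lam_phi)
    show("lambda_theta (45 deg)", lam_theta)
    show("lambda_psi (90 deg)", lam_psi)
    show("lambda = lambda_psi . lambda_theta . lambda_phi", lam)
    print("z axis in final frame:", [round(x, 4) for x in z_in_final_frame])
```

The last line applies the net rotation to the $z$ unit vector, which picks out the third column of the net matrix, $(\sin\theta\sin\psi, \sin\theta\cos\psi, \cos\theta)$.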

11.14 Angular velocity $\boldsymbol{\omega}$

It is useful to relate the rigid-body equations of motion in the space-fixed $(\hat{x}, \hat{y}, \hat{z})$ coordinate system to
those in the body-fixed $(\hat{e}_1, \hat{e}_2, \hat{e}_3)$ coordinate system where the principal axis inertia tensor is defined. It
was shown in appendix D that an infinitesimal rotation can be represented by a vector. Thus the time
derivatives of these rotation angles can be associated with the components of the angular velocity $\boldsymbol{\omega}$, where
the precession $\omega_\phi = \dot\phi$, the nutation $\omega_\theta = \dot\theta$, and the spin $\omega_\psi = \dot\psi$. Unfortunately the coordinates $(\phi, \theta, \psi)$
are with respect to mixed coordinate frames and thus are not orthogonal axes. That is, the Euler angular
velocities are expressed in different coordinate frames, where the precession $\dot\phi$ is around the space-fixed $z$
axis measured relative to the $x$ axis, the spin $\dot\psi$ is around the body-fixed $\hat{e}_3$ axis relative to the rotating
line of nodes, and the nutation $\dot\theta$ is the angular velocity between the $\hat{z}$ and $\hat{e}_3$ axes and points along the
instantaneous line of nodes in the $\hat{z} \times \hat{e}_3$ direction. By reference to figure 11.3 it can be seen that the
components along the body-fixed axes are as given in Table 11.1.


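Table 11.1: Euler angular velocity components in the body-fixed frame

Axis    Precession $\dot\phi$                               Nutation $\dot\theta$                        Spin $\dot\psi$
1       $\dot\phi_1 = \dot\phi \sin\theta \sin\psi$         $\dot\theta_1 = \dot\theta \cos\psi$         $\dot\psi_1 = 0$
2       $\dot\phi_2 = \dot\phi \sin\theta \cos\psi$         $\dot\theta_2 = -\dot\theta \sin\psi$        $\dot\psi_2 = 0$
3       $\dot\phi_3 = \dot\phi \cos\theta$                  $\dot\theta_3 = 0$                           $\dot\psi_3 = \dot\psi$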

Note that the precession angular velocity $\dot\phi$ is the angular velocity with which the body-fixed $\hat{e}_3$ axis and the
line of nodes $\hat{z} \times \hat{3}$ precess around the space-fixed $z$ axis. Table 11.1 gives the Euler angular velocities required
to calculate the components of the angular velocity $\boldsymbol{\omega}$ for the body-fixed $(\hat{1}, \hat{2}, \hat{3})$ axis system. Collecting the individual
components of $\boldsymbol{\omega}$ gives the components of the angular velocity of the body, relative to the space-fixed axes,
in the body-fixed axis system $(\hat{1}, \hat{2}, \hat{3})$

$$\omega_1 = \dot\phi_1 + \dot\theta_1 + \dot\psi_1 = \dot\phi \sin\theta \sin\psi + \dot\theta \cos\psi \tag{11.86}$$

$$\omega_2 = \dot\phi_2 + \dot\theta_2 + \dot\psi_2 = \dot\phi \sin\theta \cos\psi - \dot\theta \sin\psi \tag{11.87}$$

$$\omega_3 = \dot\phi_3 + \dot\theta_3 + \dot\psi_3 = \dot\phi \cos\theta + \dot\psi \tag{11.88}$$

The angular velocity of the body about the body-fixed 3 axis, $\omega_3$, is the sum of the projection onto the 3 axis of the
precession angular velocity $\dot\phi$ of the line of nodes with respect to the space-fixed $x$ axis, plus the spin angular
velocity $\dot\psi$ of the body about the body-fixed 3 axis with respect to the rotating line of nodes.

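Similarly, the components of the body angular velocity $\boldsymbol{\omega}$ for the space-fixed axis system $(x, y, z)$ can be
derived to be

$$\omega_x = \dot\theta \cos\phi + \dot\psi \sin\theta \sin\phi \tag{11.89}$$

$$\omega_y = \dot\theta \sin\phi - \dot\psi \sin\theta \cos\phi \tag{11.90}$$

$$\omega_z = \dot\phi + \dot\psi \cos\theta \tag{11.91}$$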


Note that when $\theta = 0$ then the Euler angles are singular in that the space-fixed $z$ axis is parallel with
the body-fixed 3 axis and there is no way of distinguishing between precession $\dot\phi$ and spin $\dot\psi$, leading to
$\omega_z = \omega_3 = \dot\phi + \dot\psi$. When $\theta = \pi$ then the $z$ axis and 3 axis are antiparallel and $\omega_z = \dot\phi - \dot\psi = -\omega_3$. The other
special case is when $\cos\theta = 0$, for which the Euler angle system is orthogonal and the space-fixed $\omega_z = \dot\phi$,
that is, it equals the precession, while the body-fixed $\omega_3 = \dot\psi$, that is, it equals the spin. When the Euler
angle basis is not orthogonal then equations (11.86-88) and (11.89-91) are needed for expressing the
Euler equations of motion in either the body-fixed frame or the space-fixed frame respectively.

Equations 11.86-88 for the components of the angular velocity in the body-fixed frame can be expressed
in terms of the Euler angle velocities in a matrix form as

$$\begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} = \begin{pmatrix} \sin\theta\sin\psi & \cos\psi & 0 \\ \sin\theta\cos\psi & -\sin\psi & 0 \\ \cos\theta & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} \dot\phi \\ \dot\theta \\ \dot\psi \end{pmatrix} \tag{11.92}$$

Again note that the transformation matrix is not orthogonal, which is to be expected since the Euler angular
velocities are about axes that do not form a rectangular system of coordinates. Similarly equations 11.89-91
for the angular velocity in the space-fixed frame can be expressed in terms of the Euler angle velocities in
matrix form as

$$\begin{pmatrix} \omega_x \\ \omega_y \\ \omega_z \end{pmatrix} = \begin{pmatrix} 0 & \cos\phi & \sin\theta\sin\phi \\ 0 & \sin\phi & -\sin\theta\cos\phi \\ 1 & 0 & \cos\theta \end{pmatrix} \cdot \begin{pmatrix} \dot\phi \\ \dot\theta \\ \dot\psi \end{pmatrix} \tag{11.93}$$
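
The body-fixed and space-fixed components describe the same vector $\boldsymbol{\omega}$, so they should be related by the rotation matrix of equation 11.83. The sketch below checks this for one arbitrary set of angles and rates; the function names and the sample values are assumptions made for the illustration.

```python
import math


def euler_matrix(phi, theta, psi):
    """Rotation matrix of equation 11.83 (space-fixed to body-fixed components)."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [
        [cf * cp - sf * ct * sp, sf * cp + cf * ct * sp, st * sp],
        [-cf * sp - sf * ct * cp, -sf * sp + cf * ct * cp, st * cp],
        [sf * st, -cf * st, ct],
    ]


def omega_body(phi, theta, psi, dphi, dtheta, dpsi):
    """Body-fixed components, equations 11.86 to 11.88."""
    return [
        dphi * math.sin(theta) * math.sin(psi) + dtheta * math.cos(psi),
        dphi * math.sin(theta) * math.cos(psi) - dtheta * math.sin(psi),
        dphi * math.cos(theta) + dpsi,
    ]


def omega_space(phi, theta, psi, dphi, dtheta, dpsi):
    """Space-fixed components, equations 11.89 to 11.91."""
    return [
        dtheta * math.cos(phi) + dpsi * math.sin(theta) * math.sin(phi),
        dtheta * math.sin(phi) - dpsi * math.sin(theta) * math.cos(phi),
        dphi + dpsi * math.cos(theta),
    ]


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


# Arbitrary sample state: angles in radians, rates in radians per unit time.
STATE = (0.7, 1.1, -0.4, 0.3, -0.2, 1.5)

if __name__ == "__main__":
    w_body = omega_body(*STATE)
    w_space = omega_space(*STATE)
    rotated = matvec(euler_matrix(*STATE[:3]), w_space)
    print("body-fixed components :", [round(x, 6) for x in w_body])
    print("lambda . omega_space  :", [round(x, 6) for x in rotated])
```

Agreement of the two printed vectors confirms that equations 11.86-88 and 11.89-91 are two component representations of one angular velocity vector.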


11.15  Kinetic  energy  in  terms  of  Euler  angular  velocities 


The kinetic energy is a scalar quantity and thus is the same in both stationary and rotating frames of
reference. It is much easier to evaluate the kinetic energy in the rotating principal-axis frame since the
inertia tensor is diagonal in the principal-axis frame as given in equation 11.69

$$T_{rot} = \frac{1}{2} \sum_i^3 I_{ii} \omega_i^2 \tag{11.94}$$

Using equations 11.86-88 for the body-fixed angular velocities gives the rotational kinetic energy in terms
of the Euler angular velocities and principal-frame moments of inertia $I_1, I_2, I_3$ to be

$$T_{rot} = \frac{1}{2} \left[ I_1 \left( \dot\phi \sin\theta \sin\psi + \dot\theta \cos\psi \right)^2 + I_2 \left( \dot\phi \sin\theta \cos\psi - \dot\theta \sin\psi \right)^2 + I_3 \left( \dot\phi \cos\theta + \dot\psi \right)^2 \right] \tag{11.95}$$
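
As a worked illustration of equation 11.95, consider the symmetric top with $I_1 = I_2$, a special case chosen here for the example. The first two squared brackets then combine, and the cross terms in $\dot\phi\dot\theta$ cancel:

```latex
\begin{aligned}
\left(\dot\phi\sin\theta\sin\psi + \dot\theta\cos\psi\right)^2
 + \left(\dot\phi\sin\theta\cos\psi - \dot\theta\sin\psi\right)^2
 &= \dot\phi^2\sin^2\theta\left(\sin^2\psi + \cos^2\psi\right)
  + \dot\theta^2\left(\cos^2\psi + \sin^2\psi\right) \\
 &\quad + 2\dot\phi\dot\theta\sin\theta\left(\sin\psi\cos\psi - \cos\psi\sin\psi\right) \\
 &= \dot\phi^2\sin^2\theta + \dot\theta^2 ,
\end{aligned}
\qquad\text{so that}\qquad
T_{rot} = \tfrac{1}{2} I_1\left(\dot\phi^2\sin^2\theta + \dot\theta^2\right)
        + \tfrac{1}{2} I_3\left(\dot\phi\cos\theta + \dot\psi\right)^2 .
```

For $I_1 = I_2$ the kinetic energy therefore does not depend on the angle $\psi$, only on its rate $\dot\psi$.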

11.16 Rotational invariants

The scalar properties of a rotating body, such as mass $M$, Lagrangian $L$, and Hamiltonian $H$, are rotationally
invariant, that is, they are the same in any body-fixed or laboratory-fixed coordinate frame. This fact also
applies to scalar products of all vector observables such as angular momentum. For example the scalar
product

$$\mathbf{L} \cdot \mathbf{L} = l^2 \tag{11.96}$$

where $l$ is the root mean square value of the angular momentum. An example of a scalar invariant is the
scalar product of the angular velocity

$$\boldsymbol{\omega} \cdot \boldsymbol{\omega} = \omega^2 \tag{11.97}$$

where $\omega^2$ is the mean square angular velocity. The scalar product $\boldsymbol{\omega} \cdot \boldsymbol{\omega} = |\omega|^2$ can be calculated using the
Euler-angle velocities for the body-fixed frame, equations 11.86-88, to be

$$\boldsymbol{\omega} \cdot \boldsymbol{\omega} = |\omega|^2 = \omega_1^2 + \omega_2^2 + \omega_3^2 = \dot\phi^2 + \dot\theta^2 + \dot\psi^2 + 2\dot\phi\dot\psi\cos\theta$$

Similarly, the scalar product can be calculated using the Euler angle velocities for the space-fixed frame
using equations 11.89-91.

$$\boldsymbol{\omega} \cdot \boldsymbol{\omega} = |\omega|^2 = \omega_x^2 + \omega_y^2 + \omega_z^2 = \dot\phi^2 + \dot\theta^2 + \dot\psi^2 + 2\dot\phi\dot\psi\cos\theta$$

This shows the obvious result that the scalar product $\boldsymbol{\omega} \cdot \boldsymbol{\omega} = |\omega|^2$ is invariant to rotations of the coordinate
frame, that is, it is identical when evaluated in either the space-fixed, or body-fixed frames.
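
The body-fixed evaluation can be written out term by term as a check. The following expansion uses only the body-fixed components $\omega_1, \omega_2, \omega_3$ of equations 11.86-88:

```latex
\begin{aligned}
\omega_1^2 + \omega_2^2
 &= \left(\dot\phi\sin\theta\sin\psi + \dot\theta\cos\psi\right)^2
  + \left(\dot\phi\sin\theta\cos\psi - \dot\theta\sin\psi\right)^2
  = \dot\phi^2\sin^2\theta + \dot\theta^2 , \\
\omega_3^2
 &= \left(\dot\phi\cos\theta + \dot\psi\right)^2
  = \dot\phi^2\cos^2\theta + 2\dot\phi\dot\psi\cos\theta + \dot\psi^2 , \\
\omega_1^2 + \omega_2^2 + \omega_3^2
 &= \dot\phi^2\left(\sin^2\theta + \cos^2\theta\right) + \dot\theta^2 + \dot\psi^2 + 2\dot\phi\dot\psi\cos\theta
  = \dot\phi^2 + \dot\theta^2 + \dot\psi^2 + 2\dot\phi\dot\psi\cos\theta .
\end{aligned}
```

The only surviving cross term couples $\dot\phi$ and $\dot\psi$, because the precession and spin axes are separated by the angle $\theta$ while the nutation axis is perpendicular to both.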

Note  that  for  0 = 0,  the  3 and  z axes  are  parallel,  and  perpendicular  to  the  9 axis,  then 

M2  = (</>  + v>)  +e2 

For  the  case  when  9 = 180°,  the  3 and  z axes  are  antiparallel,  and  perpendicular  to  the  9 axis,  then 


For  the  case  when  9 = 90°,  the  3 , z , and  9 axes  are  mutually  perpendicular,  that  is,  orthogonal,  and  then 

o -2  -2  -2 

\uj\2  = ij>  +ip  +9 
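The frame invariance of $|\omega|^2$ can be checked numerically. The following is a minimal sketch rather than a definitive implementation: the body-fixed components follow the form quoted in this chapter for equations 11.86-11.88, while the space-fixed components are an assumption, namely the standard z-x-z Euler-angle form taken here for equations 11.89-11.91. The function names and the random sampling ranges are illustrative.

```python
import math
import random

def omega_body(phi, theta, psi, dphi, dtheta, dpsi):
    """Body-fixed components (omega_1, omega_2, omega_3), form of equations 11.86-11.88."""
    w1 = dphi * math.sin(theta) * math.sin(psi) + dtheta * math.cos(psi)
    w2 = dphi * math.sin(theta) * math.cos(psi) - dtheta * math.sin(psi)
    w3 = dphi * math.cos(theta) + dpsi
    return (w1, w2, w3)

def omega_space(phi, theta, psi, dphi, dtheta, dpsi):
    """Space-fixed components (omega_x, omega_y, omega_z).

    ASSUMPTION: standard z-x-z convention, taken to match equations 11.89-11.91.
    """
    wx = dtheta * math.cos(phi) + dpsi * math.sin(theta) * math.sin(phi)
    wy = dtheta * math.sin(phi) - dpsi * math.sin(theta) * math.cos(phi)
    wz = dphi + dpsi * math.cos(theta)
    return (wx, wy, wz)

def norm_squared(v):
    """Scalar product of a 3-vector with itself."""
    return sum(c * c for c in v)

def closed_form(theta, dphi, dtheta, dpsi):
    """The invariant written directly in terms of the Euler-angle velocities."""
    return dphi**2 + dtheta**2 + dpsi**2 + 2.0 * dphi * dpsi * math.cos(theta)

# Compare the three expressions at many random orientations and angular rates.
rng = random.Random(0)
max_mismatch = 0.0
for _ in range(1000):
    phi, theta, psi = (rng.uniform(0.0, 2.0 * math.pi) for _ in range(3))
    dphi, dtheta, dpsi = (rng.uniform(-5.0, 5.0) for _ in range(3))
    args = (phi, theta, psi, dphi, dtheta, dpsi)
    body = norm_squared(omega_body(*args))
    space = norm_squared(omega_space(*args))
    exact = closed_form(theta, dphi, dtheta, dpsi)
    max_mismatch = max(max_mismatch, abs(body - space), abs(body - exact))
```

Under these assumptions `max_mismatch` stays at the level of floating-point rounding, which is the numerical counterpart of the statement that the scalar product is the same in both frames.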

The  time-averaged  shape  of  a rapidly-rotating  body,  as  seen  in  the  fixed  inertial  frame,  is  very  different 
from  the  actual  shape  of  the  body,  and  this  difference  depends  on  the  rotational  frequency.  For  example,  a 
pencil  rotating  rapidly  about  an  axis  perpendicular  to  the  body-fixed  symmetry  axis  has  an  average  shape 
that  is  a flat  disk  in  the  laboratory  frame  which  bears  little  resemblance  to  a pencil.  The  actual  shape  of  the 
pencil  could  be  determined  by  taking  high-speed  photographs  which  display  the  instantaneous  body-fixed 
shape  of  the  object  at  given  times.  Unfortunately  for  fast  rotation,  such  as  rotation  of  a molecule  or  a 
nucleus,  it  is  not  possible  to  take  photographs  with  sufficient  speed  and  spatial  resolution  to  observe  the 
instantaneous  shape  of  the  rotating  body.  What  is  measured  is  the  average  shape  of  the  body  as  seen  in  the 
fixed  laboratory  frame.  In  principle  the  shape  observed  in  the  fixed  inertial  frame  can  be  related  to  the  shape 
in  the  body-fixed  frame,  but  this  requires  knowing  the  body-fixed  shape  which  in  general  is  not  known.  For 
example,  a deformed  nucleus  may  be  both  vibrating  and  rotating  about  some  triaxially  deformed  average 
shape  which  is  a function  of  the  rotational  frequency.  This  is  not  apparent  from  the  shapes  measured  in  the 
fixed  frame  for  each  of  the  excited  states. 

The fact that scalar products are rotationally invariant provides a powerful means of transforming products
of observables in the body-fixed frame to those in the laboratory frame. In 1971 Cline developed
a powerful model-independent method that utilizes rotationally-invariant products of the electromagnetic
quadrupole operator E2 to relate the electromagnetic E2 properties for the observed levels of a rotating
nucleus measured in the laboratory frame, to the electromagnetic E2 properties of the deformed rotating
nucleus measured in the body-fixed frame [Cli71, Cli72, Cli86]. The method uses the fact that scalar products
of the electromagnetic multipole operators are rotationally invariant. This allows transforming scalar products
of a complete set of measured electromagnetic matrix elements, measured in the laboratory frame, into
the electromagnetic properties in the body-fixed frame of the rotating nucleus. These rotational invariants
provide a model-independent determination of the magnitude, triaxiality, and vibrational amplitudes of the
average shapes in the body-fixed frame for individual observed nuclear states that may be undergoing both
rotation and vibration. When the bombarding energy is below the Coulomb barrier, the scattering of a
projectile nucleus by a target nucleus is due purely to the electromagnetic interaction since the distance
of closest approach exceeds the range of the nuclear force. For such pure Coulomb collisions, the
electromagnetic excitation of collective nuclei populates many excited states, as illustrated in figure 12.13, with
cross sections that are a direct measure of the E2 matrix elements. These measured matrix elements are
precisely those required to evaluate, in the laboratory frame, the E2 rotational invariants from which it is
possible to deduce the intrinsic quadrupole shapes of the rotating-vibrating nuclear states in the body-fixed
frame [Cli86].


11.17  Euler’s  equations  of  motion  for  rigid-body  rotation 


Rigid-body rotation can be confusing in that two coordinate frames are involved and, in general, the angular
velocity and angular momentum are not aligned. The motion of the rigid body is observed in the space-fixed
inertial frame, whereas it is simpler to calculate the equations of motion in the body-fixed principal axis
frame, for which the inertia tensor is known and is constant. The rigid body is rotating about the angular
velocity vector $\boldsymbol{\omega}$, which is not aligned with the angular momentum $\mathbf{L}$. For torque-free motion, $\mathbf{L}$ is conserved
and has a fixed orientation in the space-fixed axis system. Euler's equations of motion, presented below,
are given in the body-fixed frame for which the inertia tensor is known, since this simplifies solution of the
equations of motion. However, this solution has to be rotated back into the space-fixed frame to describe
the rotational motion as seen by an observer in the inertial frame.

This  chapter  has  introduced  the  inertial  properties  of  a rigid  body,  as  well  as  the  Euler  angles  for 
transforming  between  the  body-fixed  and  inertial  frames  of  reference.  This  has  prepared  the  stage  for 
solving  the  equations  of  motion  for  rigid-body  motion,  namely,  the  dynamics  of  rotational  motion  about  a 
body-fixed  point  under  the  action  of  external  forces.  The  Euler  angles  are  used  to  specify  the  instantaneous 
orientation  of  the  rigid  body. 

In Newtonian mechanics, the rotational motion is governed by the equivalent of Newton's second law given
in terms of the external torque $\mathbf{N}$ and angular momentum $\mathbf{L}$

$$\mathbf{N} = \left(\frac{d\mathbf{L}}{dt}\right)_{space} \tag{11.98}$$


Note  that  this  relation  is  expressed  in  the  inertial  space-fixed  frame  of  reference,  not  the  non-inertial  body- 
fixed  frame.  The  subscript  space  is  added  to  emphasize  that  this  equation  is  written  in  the  inertial  space-fixed 
frame  of  reference.  However,  as  already  discussed,  it  is  much  more  convenient  to  transform  from  the  space- 
fixed  inertial  frame  to  the  body-fixed  frame  for  which  the  inertia  tensor  of  the  rigid  body  is  known.  Thus  the 
next  stage  is  to  express  the  rotational  motion  in  terms  of  the  body-fixed  frame  of  reference.  For  simplicity, 
translational  motion  will  be  ignored. 

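The rate of change of angular momentum can be written in terms of the body-fixed value, using the
transformation from the space-fixed inertial frame $(\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}})$ to the rotating frame $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ as given in
chapter 10.3,

$$\mathbf{N} = \left(\frac{d\mathbf{L}}{dt}\right)_{space} = \left(\frac{d\mathbf{L}}{dt}\right)_{body} + \boldsymbol{\omega}\times\mathbf{L} \tag{11.99}$$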


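However, the body axes $\hat{\mathbf{e}}_i$ are chosen to be the principal axes such that

$$L_i = I_i\omega_i \tag{11.100}$$

where the principal moments of inertia are written as $I_i$. Thus the equation of motion can be written using
the body-fixed coordinate system as

$$\mathbf{N} = I_1\dot\omega_1\hat{\mathbf{e}}_1 + I_2\dot\omega_2\hat{\mathbf{e}}_2 + I_3\dot\omega_3\hat{\mathbf{e}}_3 + \begin{vmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ \omega_1 & \omega_2 & \omega_3 \\ I_1\omega_1 & I_2\omega_2 & I_3\omega_3 \end{vmatrix} \tag{11.101}$$

$$\mathbf{N} = \left(I_1\dot\omega_1 - (I_2 - I_3)\,\omega_2\omega_3\right)\hat{\mathbf{e}}_1 + \left(I_2\dot\omega_2 - (I_3 - I_1)\,\omega_3\omega_1\right)\hat{\mathbf{e}}_2 + \left(I_3\dot\omega_3 - (I_1 - I_2)\,\omega_1\omega_2\right)\hat{\mathbf{e}}_3 \tag{11.102}$$

where the components in the body-fixed axes are given by

$$\begin{aligned} N_1 &= I_1\dot\omega_1 - (I_2 - I_3)\,\omega_2\omega_3 \\ N_2 &= I_2\dot\omega_2 - (I_3 - I_1)\,\omega_3\omega_1 \\ N_3 &= I_3\dot\omega_3 - (I_1 - I_2)\,\omega_1\omega_2 \end{aligned} \tag{11.103}$$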

These are the Euler equations for a rigid body in a force field, expressed in the body-fixed coordinate
frame. They are applicable for any applied external torque $\mathbf{N}$.

The motion of a rigid body depends on the structure of the body only via the three principal moments
of inertia $I_1$, $I_2$, and $I_3$. Thus all bodies having the same principal moments of inertia will behave exactly the
same even though the bodies may have very different shapes. As discussed earlier, the simplest geometrical
shape of a body having three different principal moments is a homogeneous ellipsoid. Thus, the rigid-body
motion often is described in terms of the equivalent ellipsoid that has the same principal moments.

A deficiency of Euler's equations is that the solutions yield the time variation of $\boldsymbol{\omega}$ as seen from the
body-fixed reference frame axes, and not in the observer's fixed inertial coordinate frame. Similarly, the components
of the external torques in the Euler equations are given with respect to the body-fixed axis system, which
implies that the orientation of the body is already known. Thus for non-zero external torques the problem
cannot be solved until the orientation is known in order to determine the components $N_i^{ext}$. However,
these difficulties disappear when the external torques are zero, or if the motion of the body is known and it
is required to compute the applied torques necessary to produce such motion.
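The last case above, computing the torque needed to produce a known motion, is a direct evaluation of Euler's equations. The sketch below is illustrative only: the function names and the sample moments and angular velocities are hypothetical. It evaluates the component form of Euler's equations and compares it with the vector form $\mathbf{N} = (d\mathbf{L}/dt)_{body} + \boldsymbol{\omega}\times\mathbf{L}$ with $L_i = I_i\omega_i$.

```python
def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def euler_torque(I, w, wdot):
    """Body-fixed torque components from Euler's equations (component form)."""
    I1, I2, I3 = I
    w1, w2, w3 = w
    d1, d2, d3 = wdot
    return (I1 * d1 - (I2 - I3) * w2 * w3,
            I2 * d2 - (I3 - I1) * w3 * w1,
            I3 * d3 - (I1 - I2) * w1 * w2)

def torque_vector_form(I, w, wdot):
    """Same torque from N = (dL/dt)_body + omega x L, with L_i = I_i * omega_i."""
    L = tuple(Ii * wi for Ii, wi in zip(I, w))
    Ldot_body = tuple(Ii * di for Ii, di in zip(I, wdot))
    return tuple(a + b for a, b in zip(Ldot_body, cross(w, L)))

# Hypothetical principal moments, body-fixed angular velocity and its rate of change.
I = (1.0, 2.0, 3.0)
w = (0.4, -1.2, 0.7)
wdot = (0.1, 0.3, -0.2)

N_components = euler_torque(I, w, wdot)
N_vector = torque_vector_form(I, w, wdot)
```

The two results agree component by component, which is just the statement that the component equations are the expansion of the determinant form of the vector equation.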


11.18  Lagrange  equations  of  motion  for  rigid-body  rotation 


The  Euler  equations  of  motion  were  derived  using  Newtonian  concepts  of  torque  and  angular  momentum. 
It  is  of  interest  to  derive  the  equations  of  motion  using  Lagrangian  mechanics.  It  is  convenient  to  use  a 
generalized  torque  N and  assume  that  U = 0 in  the  Lagrange-Euler  equations.  Note  that  the  generalized 
force  is  a torque  since  the  corresponding  generalized  coordinate  is  an  angle,  and  the  conjugate  momentum 
is  angular  momentum.  If  the  body-fixed  frame  of  reference  is  chosen  to  be  the  principal  axes  system,  then, 
since  the  inertia  tensor  is  diagonal  in  the  principal  axis  frame,  the  kinetic  energy  is  given  in  terms  of  the 
principal  moments  of  inertia  as 

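$$T = \frac{1}{2}\sum_i I_i\omega_i^2 \tag{11.104}$$

Using the Euler angles as generalized coordinates, the Lagrange equation for the specific case of the $\psi$
coordinate, including a generalized force $N_\psi$, gives

$$\frac{d}{dt}\frac{\partial T}{\partial\dot\psi} - \frac{\partial T}{\partial\psi} = N_\psi \tag{11.105}$$

which can be expressed as

$$\frac{d}{dt}\sum_i\frac{\partial T}{\partial\omega_i}\frac{\partial\omega_i}{\partial\dot\psi} - \sum_i\frac{\partial T}{\partial\omega_i}\frac{\partial\omega_i}{\partial\psi} = N_\psi \tag{11.106}$$

Equation 11.104 gives

$$\frac{\partial T}{\partial\omega_i} = I_i\omega_i \tag{11.107}$$

Differentiating the angular velocity components in the body-fixed frame, equations 11.86-11.88, gives

$$\frac{\partial\omega_1}{\partial\psi} = \dot\phi\sin\theta\cos\psi - \dot\theta\sin\psi = \omega_2 \qquad \frac{\partial\omega_1}{\partial\dot\psi} = 0$$

$$\frac{\partial\omega_2}{\partial\psi} = -\dot\phi\sin\theta\sin\psi - \dot\theta\cos\psi = -\omega_1 \qquad \frac{\partial\omega_2}{\partial\dot\psi} = 0$$

$$\frac{\partial\omega_3}{\partial\psi} = 0 \qquad \frac{\partial\omega_3}{\partial\dot\psi} = 1$$

Substituting these into the Lagrange equation (11.106) gives

$$\frac{d}{dt}\left(I_3\omega_3\right) - I_1\omega_1\omega_2 - I_2\omega_2\left(-\omega_1\right) = N_3 \tag{11.108}$$

where $N_\psi = N_3$ since the $\dot\psi$ and $\hat{\mathbf{e}}_3$ axes are colinear. This can be rewritten as

$$I_3\dot\omega_3 - (I_1 - I_2)\,\omega_1\omega_2 = N_3 \tag{11.109}$$

Any axis could have been designated the $\hat{\mathbf{e}}_3$ axis, thus the above equation can be generalized to all three
axes to give

$$\begin{aligned} I_1\dot\omega_1 - (I_2 - I_3)\,\omega_2\omega_3 &= N_1 \\ I_2\dot\omega_2 - (I_3 - I_1)\,\omega_3\omega_1 &= N_2 \\ I_3\dot\omega_3 - (I_1 - I_2)\,\omega_1\omega_2 &= N_3 \end{aligned} \tag{11.110}$$

These are Euler's equations given previously in (11.103). Note that although the equation for $\dot\omega_3$ is the equation
of motion for the $\psi$ coordinate, this is not true for the $\phi$ and $\theta$ rotations, which are not along the body-fixed
$x_1$ and $x_2$ axes, as given in table 11.1.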
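The partial derivatives with respect to $\psi$ used in this derivation, $\partial\omega_1/\partial\psi = \omega_2$, $\partial\omega_2/\partial\psi = -\omega_1$ and $\partial\omega_3/\partial\psi = 0$, can be spot-checked by finite differences. This is a minimal sketch assuming the body-fixed components in the form quoted for equations 11.86-11.88; the function names, the sample state and the step size `h` are illustrative choices.

```python
import math

def omega_body(theta, psi, dphi, dtheta, dpsi):
    """Body-fixed angular velocity components, form of equations 11.86-11.88."""
    w1 = dphi * math.sin(theta) * math.sin(psi) + dtheta * math.cos(psi)
    w2 = dphi * math.sin(theta) * math.cos(psi) - dtheta * math.sin(psi)
    w3 = dphi * math.cos(theta) + dpsi
    return (w1, w2, w3)

def d_omega_d_psi(i, theta, psi, dphi, dtheta, dpsi, h=1e-6):
    """Central-difference estimate of the partial derivative of omega_i with respect to psi."""
    up = omega_body(theta, psi + h, dphi, dtheta, dpsi)[i]
    down = omega_body(theta, psi - h, dphi, dtheta, dpsi)[i]
    return (up - down) / (2.0 * h)

# Hypothetical sample state (angles in radians, rates in rad/s).
state = dict(theta=0.7, psi=1.1, dphi=0.4, dtheta=-0.3, dpsi=2.0)
w1, w2, w3 = omega_body(**state)

d1 = d_omega_d_psi(0, **state)   # expected to approximate  omega_2
d2 = d_omega_d_psi(1, **state)   # expected to approximate -omega_1
d3 = d_omega_d_psi(2, **state)   # expected to approximate  0
```

The antisymmetric pair $\partial\omega_1/\partial\psi = \omega_2$, $\partial\omega_2/\partial\psi = -\omega_1$ is what produces the $(I_1 - I_2)\,\omega_1\omega_2$ term in the $\psi$ equation of motion.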

11.8  Example:  Rotation  of  a dumbbell 

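Consider the motion of the symmetric dumbbell shown in the adjacent figure. Let $|\mathbf{r}_1| = |\mathbf{r}_2| = b$. Let the
body-fixed coordinate system have its origin at $O$ and symmetry axis $\hat{\mathbf{e}}_3$ be along the weightless shaft toward
$m_1$, and $\mathbf{v}_\alpha = v_\alpha\hat{\mathbf{e}}_1$. The angular momentum is given by

$$\mathbf{L} = \sum_\alpha m_\alpha\,\mathbf{r}_\alpha\times\mathbf{v}_\alpha$$

Because $\mathbf{L}$ is perpendicular to the shaft, and $\mathbf{L}$ rotates around $\boldsymbol{\omega}$ as the shaft rotates, let $\hat{\mathbf{e}}_2$ be along $\mathbf{L}$.

$$\mathbf{L} = L_2\hat{\mathbf{e}}_2$$

If $\alpha$ is the angle between $\boldsymbol{\omega}$ and the shaft, the components of $\boldsymbol{\omega}$ are

$$\omega_1 = 0 \qquad \omega_2 = \omega\sin\alpha \qquad \omega_3 = \omega\cos\alpha$$

Assume that the principal moments of the dumbbell are

$$I_1 = (m_1 + m_2)\,b^2 \qquad I_2 = (m_1 + m_2)\,b^2 \qquad I_3 = 0$$

Thus the angular momentum is given by

$$L_1 = I_1\omega_1 = 0 \qquad L_2 = I_2\omega_2 = (m_1 + m_2)\,b^2\omega\sin\alpha \qquad L_3 = I_3\omega_3 = 0$$

which is consistent with the angular momentum being along the $\hat{\mathbf{e}}_2$ axis.

Using Euler's equations, and assuming that the angular velocity is constant, i.e. $\dot{\boldsymbol{\omega}} = 0$, then the
components of the torque required to satisfy this motion are

$$N_1 = -(m_1 + m_2)\,b^2\omega^2\sin\alpha\cos\alpha \qquad N_2 = 0 \qquad N_3 = 0$$

That is, this motion can only occur in the presence of the above applied torque, which is in the direction
$-\hat{\mathbf{e}}_1$, that is, mutually perpendicular to $\hat{\mathbf{e}}_2$ and $\hat{\mathbf{e}}_3$. This torque can be written as $\mathbf{N} = \boldsymbol{\omega}\times\mathbf{L}$.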
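The closing relation $\mathbf{N} = \boldsymbol{\omega}\times\mathbf{L}$ can be evaluated numerically for the dumbbell and compared with the closed-form $N_1$. The masses, half-length, rotation rate and angle below are hypothetical values chosen only for illustration.

```python
import math

def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Hypothetical dumbbell parameters (SI units) and motion.
m1, m2, b = 1.0, 1.0, 0.5          # masses and distance of each mass from O
omega, alpha = 2.0, 0.3            # constant angular speed and angle to the shaft

# Principal moments as assumed in the example: point masses on a weightless shaft.
I1 = I2 = (m1 + m2) * b**2
I3 = 0.0

# Body-fixed components of omega and of L, with L_i = I_i * omega_i.
w = (0.0, omega * math.sin(alpha), omega * math.cos(alpha))
L = (I1 * w[0], I2 * w[1], I3 * w[2])

# For constant omega the required torque is N = omega x L.
N = cross(w, L)

# Closed-form component quoted in the example.
N1_expected = -(m1 + m2) * b**2 * omega**2 * math.sin(alpha) * math.cos(alpha)
```

Only `N[0]` is non-zero and it is negative for $0 < \alpha < 90°$, in line with the torque pointing along $-\hat{\mathbf{e}}_1$.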

11.19 Hamiltonian equations of motion for rigid-body rotation

The Hamiltonian equations of motion are expressed in terms of the Euler angles plus their corresponding
canonical angular momenta $(\phi, \theta, \psi, p_\phi, p_\theta, p_\psi)$, in contrast to Lagrangian mechanics which is based on the
Euler angles plus their corresponding angular velocities $(\phi, \theta, \psi, \dot\phi, \dot\theta, \dot\psi)$. The Hamiltonian approach is
conveniently expressed in terms of a set of Andoyer-Deprit action-angle coordinates that include the three Euler
angles, specifying the orientation of the body-fixed frame, plus the corresponding three angles specifying the
orientation of the spin frame of reference. This phase-space approach [Dep67] can be employed for
calculations of rotational motion in celestial mechanics that can include spin-orbit coupling. This Hamiltonian
approach is beyond the scope of the present textbook.

11.20 Torque-free rotation of an inertially-symmetric rigid rotor

11.20.1 Euler's equations of motion:

There are many situations where one has rigid-body motion free of external torques, that is, $\mathbf{N} = 0$. The
tumbling motion of a juggler's baton, a diver, a rotating galaxy, or a frisbee, are examples of rigid-body
rotation. For torque-free rotation, the body will rotate about the center of mass, and thus the inertia tensor
with respect to the center of mass is required. An inertially-symmetric rigid body has two identical principal
moments of inertia with $I_1 = I_2 \neq I_3$, and provides a simple example that illustrates the underlying motion.
The force-free Euler equations for the symmetric body in the body-fixed principal axis system are given by

$$(I_2 - I_3)\,\omega_2\omega_3 - I_1\dot\omega_1 = 0 \tag{11.111}$$

$$(I_3 - I_1)\,\omega_3\omega_1 - I_2\dot\omega_2 = 0 \tag{11.112}$$

$$I_3\dot\omega_3 = 0 \tag{11.113}$$

where $I_1 = I_2$ and $\mathbf{N} = 0$ apply.

Note that for torque-free motion of an inertially symmetric body equation 11.113 implies that $\dot\omega_3 = 0$,
i.e. $\omega_3$ is a constant of motion and thus is a cyclic variable for the symmetric rigid body.

Equations 11.111 and 11.112 can be written as two coupled equations

$$\dot\omega_1 + \Omega\omega_2 = 0 \tag{11.114}$$

$$\dot\omega_2 - \Omega\omega_1 = 0 \tag{11.115}$$

where the precession angular velocity $\Omega$ with respect to the body-fixed frame is defined to be

$$\Omega \equiv \left(\frac{I_3 - I_1}{I_1}\right)\omega_3 \tag{11.116}$$

Combining the time derivatives of equations 11.114 and 11.115 leads to two uncoupled equations

$$\ddot\omega_1 + \Omega^2\omega_1 = 0 \tag{11.117}$$

$$\ddot\omega_2 + \Omega^2\omega_2 = 0 \tag{11.118}$$

These are the differential equations for a harmonic oscillator with solutions

$$\omega_1 = A\cos\Omega t \qquad \omega_2 = A\sin\Omega t \tag{11.119}$$

Figure 11.4: The force-free symmetric top angular velocity $\boldsymbol{\omega}$ precesses on a conical trajectory about the
body-fixed symmetry axis 3.

These equations describe a vector $\mathbf{A}$, perpendicular to $\hat{\mathbf{e}}_3$, rotating in a circle of radius $A$, that
is, rotating in the $\hat{\mathbf{e}}_1$-$\hat{\mathbf{e}}_2$ plane with angular frequency $\Omega = -\dot\psi$. Note that

$$\omega_1^2 + \omega_2^2 = A^2 \tag{11.120}$$

which is a constant. In addition $\omega_3$ is constant, therefore the magnitude of the total angular velocity

$$|\boldsymbol{\omega}| = \sqrt{\omega_1^2 + \omega_2^2 + \omega_3^2} = \text{constant} \tag{11.121}$$

The motion of the torque-free symmetric body is that the angular velocity $\boldsymbol{\omega}$ precesses around the
symmetry axis $\hat{\mathbf{e}}_3$ of the body at an angle $\alpha$ with a constant precession frequency $\Omega$ with respect to the
body-fixed frame, as shown in figure 11.4. Thus, to an observer on the body, $\boldsymbol{\omega}$ traces out a cone around the
body-fixed symmetry axis. Note from (11.116) that the vectors $\Omega\hat{\mathbf{e}}_3$ and $\omega_3\hat{\mathbf{e}}_3$ are parallel when $I_3 > I_1$
(oblate shape), for which $\Omega$ has the same sign as $\omega_3$, and antiparallel if $I_3 < I_1$ (prolate shape).
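The harmonic solution for $\omega_1$ and $\omega_2$ can be cross-checked by integrating the torque-free Euler equations numerically. The sketch below uses a classical fourth-order Runge-Kutta step, which is a swapped-in numerical technique and not part of the text; the moments of inertia, initial conditions, step size and function names are hypothetical.

```python
import math

def euler_rhs(w, I):
    """Torque-free Euler equations solved for (d omega_1/dt, d omega_2/dt, d omega_3/dt)."""
    I1, I2, I3 = I
    w1, w2, w3 = w
    return ((I2 - I3) * w2 * w3 / I1,
            (I3 - I1) * w3 * w1 / I2,
            (I1 - I2) * w1 * w2 / I3)

def rk4_step(w, I, dt):
    """One classical fourth-order Runge-Kutta step for the body-fixed angular velocity."""
    def shifted(base, slope, scale):
        return tuple(x + scale * s for x, s in zip(base, slope))
    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(w, k1, dt / 2.0), I)
    k3 = euler_rhs(shifted(w, k2, dt / 2.0), I)
    k4 = euler_rhs(shifted(w, k3, dt), I)
    return tuple(x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for x, a, b, c, d in zip(w, k1, k2, k3, k4))

# Hypothetical oblate symmetric top: I1 = I2 < I3.
I = (1.0, 1.0, 2.0)
A, w3 = 0.3, 1.5
Omega = (I[2] - I[0]) / I[0] * w3       # body-frame precession frequency

# Initial condition matching omega_1 = A cos(Omega t), omega_2 = A sin(Omega t) at t = 0.
w = (A, 0.0, w3)
dt, steps = 0.001, 2000
for _ in range(steps):
    w = rk4_step(w, I, dt)

t = dt * steps
analytic = (A * math.cos(Omega * t), A * math.sin(Omega * t), w3)
max_error = max(abs(n - a) for n, a in zip(w, analytic))
```

With these values the numerical and analytic components agree to well within the integration tolerance, and $\omega_3$ remains fixed, as expected for a cyclic variable.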

For the system considered, the orientation of the angular momentum vector $\mathbf{L}$ must be stationary in the
space-fixed inertial frame since the system is torque free, that is, $\mathbf{L}$ is a constant of motion. Also we have
that the projection of the angular momentum on the body-fixed symmetry axis is a constant of motion, that
is, it is a cyclic variable. Thus

$$L_3 = I_3\omega_3 = \frac{I_1 I_3}{(I_3 - I_1)}\,\Omega \tag{11.122}$$

Understanding the relation between the angular momentum and angular velocity is facilitated by
considering another constant of motion for the torque-free symmetric rotor, namely the rotational kinetic energy.

$$T_{rot} = \frac{1}{2}\,\boldsymbol{\omega}\cdot\mathbf{L} = \text{constant} \tag{11.123}$$

Since $\mathbf{L}$ is a constant for torque-free motion, and also the magnitude of $\boldsymbol{\omega}$ was shown to be constant, therefore
the angle between these two vectors must be a constant to ensure that also $T_{rot} = \frac{1}{2}\boldsymbol{\omega}\cdot\mathbf{L}$ = constant. That
is, $\boldsymbol{\omega}$ precesses around $\mathbf{L}$ at a constant angle $(\theta - \alpha)$ such that the projection of $\boldsymbol{\omega}$ onto $\mathbf{L}$ is constant. Note
that

$$\boldsymbol{\omega}\times\hat{\mathbf{e}}_3 = \omega_2\hat{\mathbf{e}}_1 - \omega_1\hat{\mathbf{e}}_2 \tag{11.124}$$

and, for a symmetric rotor,

$$\mathbf{L}\cdot\boldsymbol{\omega}\times\hat{\mathbf{e}}_3 = I_1\omega_1\omega_2 - I_2\omega_1\omega_2 = 0 \tag{11.125}$$

since $I_1 = I_2$ for the symmetric rotor. Because $\mathbf{L}\cdot\boldsymbol{\omega}\times\hat{\mathbf{e}}_3 = 0$ for a symmetric top, then $\mathbf{L}$, $\boldsymbol{\omega}$ and $\hat{\mathbf{e}}_3$ are
coplanar.

Figure 11.5 shows the geometry of the motion for both oblate and prolate axially-deformed bodies. To
an observer in the space-fixed inertial frame, the angular velocity $\boldsymbol{\omega}$ traces out a cone, called the space cone,
that precesses with angular velocity $\Omega$ around the space-fixed $\mathbf{L}$ axis. For convenience, figure 11.5 assumes
that $\mathbf{L}$ and the space-fixed inertial frame $z$ axis are colinear. The angular velocity $\boldsymbol{\omega}$ also traces out the
body cone as it precesses about the body-fixed $\hat{\mathbf{e}}_3$ axis. Since $\mathbf{L}$, $\boldsymbol{\omega}$ and $\hat{\mathbf{e}}_3$ are coplanar, then the $\boldsymbol{\omega}$ vector is
at the intersection of the space and body cones as the body cone rolls around the space cone. That is, the
space and body cones have one generatrix in common which coincides with $\boldsymbol{\omega}$. As shown in figure 11.5b, for
a needle the body cone appears to roll without slipping on the outside of the space cone at the precessional
velocity of $\Omega = -\omega$. By contrast, as shown in figure 11.5a, for an oblate (disc-shaped) symmetric top the
space cone rolls inside the body cone and the precession is faster than $\omega$.

Since no external torques are acting for torque-free motion, then the magnitude and direction of the total
angular momentum are conserved. The description of the motion is simplified if $\mathbf{L}$ is taken to be along the
space-fixed $z$ axis, then the Euler angle $\theta$ is the angle between the body-fixed basis vector $\hat{\mathbf{e}}_3$ and space-fixed
basis vector $\hat{\mathbf{z}}$. If at some instant in the body frame, it is assumed that $\hat{\mathbf{e}}_2$ is aligned in the plane of $\mathbf{L}$, $\boldsymbol{\omega}$
and $\hat{\mathbf{e}}_3$, then

$$L_1 = 0 \qquad L_2 = L\sin\theta \qquad L_3 = L\cos\theta \tag{11.126}$$

If $\alpha$ is the angle between the angular velocity $\boldsymbol{\omega}$ and the body-fixed $\hat{\mathbf{e}}_3$ axis, then at the same instant

$$\omega_1 = 0 \qquad \omega_2 = \omega\sin\alpha \qquad \omega_3 = \omega\cos\alpha \tag{11.127}$$

Figure 11.5: Torque-free rotation of symmetric tops; (a) circular flat disk, (b) circular rod. The space-fixed
and body-fixed cones are shown by fine lines. The space-fixed axis system is designated by the unit vectors
$(\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}})$ and the body-fixed principal axis system by unit vectors $(\hat{1}, \hat{2}, \hat{3})$.

The components of the angular momentum also can be derived from $\mathbf{L} = \mathbf{I}\cdot\boldsymbol{\omega}$ to give

$$L_1 = I_1\omega_1 = 0 \qquad L_2 = I_2\omega_2 = I_1\omega\sin\alpha \qquad L_3 = I_3\omega_3 = I_3\omega\cos\alpha \tag{11.128}$$

Equations 11.126 and 11.128 give two relations for the ratio $\frac{L_2}{L_3}$, that is,

$$\frac{L_2}{L_3} = \tan\theta = \frac{I_1}{I_3}\tan\alpha \tag{11.129}$$

For a prolate spheroid $I_1 > I_3$, therefore $\theta > \alpha$, while $\Omega$ and $\omega_3$ have opposite signs.

For an oblate spheroid $I_1 < I_3$, therefore $\alpha > \theta$, while $\Omega$ and $\omega_3$ have the same sign.

The sense of precession can be understood if the body cone rolls without slipping on the outside of the
space cone with $\Omega$ in the opposite orientation to $\omega$ for the prolate case, while for the oblate case the space
cone rolls inside the body cone with $\Omega$ and $\omega$ oriented in similar directions. Note from (11.129) that $\theta = 0$
if $\alpha = 0$, that is, $\mathbf{L}$, $\boldsymbol{\omega}$ and the 3 axis are aligned, corresponding to a principal axis. Similarly, $\theta = 90°$ if
$\alpha = 90°$, then again $\mathbf{L}$ and $\boldsymbol{\omega}$ are aligned, corresponding to them being along a principal axis.
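The relation $\tan\theta = (I_1/I_3)\tan\alpha$ and the accompanying sign statements for $\Omega$ can be illustrated with a few numbers. This is a small sketch with hypothetical moments of inertia and function names; it assumes $0 \le \alpha < 90°$ and $\omega_3 = \omega\cos\alpha > 0$.

```python
import math

def theta_from_alpha(alpha, I1, I3):
    """Angle between L and e3 from tan(theta) = (I1/I3) tan(alpha), for 0 <= alpha < 90 deg."""
    return math.atan2(I1 * math.sin(alpha), I3 * math.cos(alpha))

def body_precession(I1, I3, omega, alpha):
    """Body-frame precession Omega = ((I3 - I1)/I1) * omega_3 with omega_3 = omega cos(alpha)."""
    return (I3 - I1) / I1 * omega * math.cos(alpha)

alpha = math.radians(20.0)

# Hypothetical prolate (I1 > I3), oblate (I1 < I3) and spherical (I1 = I3) cases.
theta_prolate = theta_from_alpha(alpha, I1=2.0, I3=1.0)
theta_oblate = theta_from_alpha(alpha, I1=1.0, I3=2.0)
theta_sphere = theta_from_alpha(alpha, I1=1.0, I3=1.0)

Omega_prolate = body_precession(I1=2.0, I3=1.0, omega=1.0, alpha=alpha)
Omega_oblate = body_precession(I1=1.0, I3=2.0, omega=1.0, alpha=alpha)
```

For the prolate case $\theta > \alpha$ and $\Omega$ is negative, for the oblate case $\theta < \alpha$ and $\Omega$ is positive, and for the sphere $\mathbf{L}$ and $\boldsymbol{\omega}$ are parallel so that $\theta = \alpha$.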

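Euler's equations have been used to calculate the motion with respect to the body-fixed principal
axis system. However, the motion needs to be known relative to the space-fixed inertial frame where the
motion is observed. This transformation can be done using the following relation

$$\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = \left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{body} + \boldsymbol{\omega}\times\hat{\mathbf{e}}_3 = \boldsymbol{\omega}\times\hat{\mathbf{e}}_3 \tag{11.130}$$

since the unit vector $\hat{\mathbf{e}}_3$ is stationary in the body-fixed frame. The vector product of $\hat{\mathbf{e}}_3$ and $\boldsymbol{\omega}\times\hat{\mathbf{e}}_3$ gives

$$\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = \hat{\mathbf{e}}_3\times\left(\boldsymbol{\omega}\times\hat{\mathbf{e}}_3\right) = \left(\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3\right)\boldsymbol{\omega} - \left(\hat{\mathbf{e}}_3\cdot\boldsymbol{\omega}\right)\hat{\mathbf{e}}_3 = \boldsymbol{\omega} - \omega_3\hat{\mathbf{e}}_3$$

therefore

$$\boldsymbol{\omega} = \hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} + \omega_3\hat{\mathbf{e}}_3 \tag{11.131}$$

The angular momentum equals $\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega}$. Since $\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}$ is perpendicular to the $\hat{\mathbf{e}}_3$ axis, then
for the case with $I_1 = I_2$,

$$\mathbf{L} = I_1\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} + I_3\omega_3\hat{\mathbf{e}}_3 \tag{11.132}$$

Thus the angular momentum for a torque-free symmetric rigid rotor comprises two components, one being
the perpendicular component that precesses around $\hat{\mathbf{e}}_3$, and the other being $L_3$.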

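In the space-fixed frame assume that the $z$ axis is colinear with $\mathbf{L}$. Then taking the scalar product of $\hat{\mathbf{e}}_3$
and $\mathbf{L}$, using equation 11.132, gives

$$\hat{\mathbf{e}}_3\cdot\mathbf{L} = I_1\hat{\mathbf{e}}_3\cdot\left[\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right] + I_3\omega_3\,\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3 \tag{11.133}$$

The first term on the right is zero and thus equations 11.133 and 11.126 give

$$L_3 = I_3\omega_3 = L\cos\theta \tag{11.134}$$

The time dependence of the rotation of the body-fixed symmetry axis with respect to the space-fixed
axis system can be obtained by taking the vector product $\hat{\mathbf{e}}_3\times\mathbf{L}$ using equation 11.132 and using equation
B.24 to expand the triple vector product,

$$\begin{aligned} \hat{\mathbf{e}}_3\times\mathbf{L} &= I_1\hat{\mathbf{e}}_3\times\left[\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right] + I_3\omega_3\,\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_3 \\ &= I_1\left[\left(\hat{\mathbf{e}}_3\cdot\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right)\hat{\mathbf{e}}_3 - \left(\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3\right)\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right] \end{aligned} \tag{11.135}$$

since $\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_3 = 0$. Moreover $\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3 = 1$, and $\hat{\mathbf{e}}_3\cdot\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = 0$ since they are perpendicular, then

$$\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = \frac{\mathbf{L}}{I_1}\times\hat{\mathbf{e}}_3 \tag{11.136}$$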


This equation shows that the body-fixed symmetry axis $\hat{\mathbf{e}}_3$ precesses around $\mathbf{L}$, where $\mathbf{L}$ is a constant
of motion for torque-free rotation. The true rotational angular velocity $\boldsymbol{\omega}$ in the space-fixed frame, given by
equation 11.131, can be evaluated using equation 11.136. Remembering that it was assumed that $\mathbf{L}$ is in
the $\hat{\mathbf{z}}$ direction, that is, $\mathbf{L} = L\hat{\mathbf{z}}$, then

$$\begin{aligned} \boldsymbol{\omega} &= \hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} + \omega_3\hat{\mathbf{e}}_3 = \frac{L}{I_1}\,\hat{\mathbf{e}}_3\times\left(\hat{\mathbf{z}}\times\hat{\mathbf{e}}_3\right) + \frac{L\cos\theta}{I_3}\,\hat{\mathbf{e}}_3 \\ &= \frac{L}{I_1}\,\hat{\mathbf{z}} + L\cos\theta\left(\frac{I_1 - I_3}{I_1 I_3}\right)\hat{\mathbf{e}}_3 \end{aligned} \tag{11.137}$$

where $\omega_3 = \frac{L\cos\theta}{I_3}$ from equation 11.134 has been used. That is, the symmetry axis of the axially-symmetric
rigid rotor makes an angle $\theta$ to the angular momentum vector $L\hat{\mathbf{z}}$ and precesses around $L\hat{\mathbf{z}}$ with a constant
angular velocity $\frac{L}{I_1}$, while the axial spin of the rigid body has a constant value $\frac{L\cos\theta}{I_3}$. Thus, in the precessing
frame, the rigid body appears to rotate about its fixed symmetry axis with a constant angular velocity
$\frac{L\cos\theta}{I_3} - \frac{L\cos\theta}{I_1} = L\cos\theta\left(\frac{I_1 - I_3}{I_1 I_3}\right)$. The precession of the
symmetry axis looks like a wobble superimposed on the spinning motion about the body-fixed symmetry
axis. The angular precession rate in the space-fixed frame can be deduced by using the fact that

$$\dot\phi\sin\theta = \omega\sin\alpha \tag{11.138}$$

Then using equation 11.129 allows equation 11.138 to be written as

$$\dot\phi = \omega\sqrt{1 + \left[\left(\frac{I_3}{I_1}\right)^2 - 1\right]\cos^2\alpha} \tag{11.139}$$

which gives the precession rate about the space-fixed axis in terms of the angular velocity $\omega$. Note that the
precession rate $\dot\phi > \omega$ if $\frac{I_3}{I_1} > 1$, that is, for oblate shapes, and $\dot\phi < \omega$ if $\frac{I_3}{I_1} < 1$, that is, for prolate shapes.

11.20.2 Lagrange equations of motion:

It is interesting to compare the equations of motion for torque-free rotation of an inertially-symmetric
rigid rotor derived using Lagrangian mechanics with those derived previously using Euler's equations based on
Newtonian mechanics. Assume that the principal moments about the fixed point of the symmetric top are
$I_1 = I_2 \neq I_3$ and that the kinetic energy equals the rotational kinetic energy, that is, it is assumed that the
translational kinetic energy $T_{trans} = 0$. Then the kinetic energy is given by

$$T = \frac{1}{2}\sum_i I_i\omega_i^2 = \frac{1}{2}I_1\left(\omega_1^2 + \omega_2^2\right) + \frac{1}{2}I_3\omega_3^2 \tag{11.140}$$

Equations 11.86-11.88 for the body-fixed frame give

$$\omega_1^2 = \left(\dot\phi\sin\theta\sin\psi + \dot\theta\cos\psi\right)^2 = \dot\phi^2\sin^2\theta\sin^2\psi + 2\dot\phi\dot\theta\sin\theta\sin\psi\cos\psi + \dot\theta^2\cos^2\psi \tag{11.141}$$


(11.152)

Since $p_\phi$ and $p_\psi$ are constants of motion, then the precessional angular velocity $\dot\phi$ about the space-fixed $z$
axis, and the spin angular velocity $\dot\psi$, which is the spin frequency about the body-fixed 3 axis, are constants
that depend directly on $I_1$, $I_3$, and $\theta$.

There is one additional constant of motion available if no dissipative forces act on the system, that is,
energy conservation, which implies that the total energy

$$E = \frac{1}{2}I_1\left(\omega_1^2 + \omega_2^2\right) + \frac{1}{2}I_3\omega_3^2 \tag{11.153}$$

will be a constant of motion. But the second term on the right-hand side also is a constant of motion since
$p_\psi$ and $I_3$ both are constants, that is

$$\frac{1}{2}I_3\omega_3^2 = \frac{1}{2}I_3\left(\dot\phi\cos\theta + \dot\psi\right)^2 = \frac{p_\psi^2}{2I_3} = \text{constant} \tag{11.154}$$

Thus energy conservation implies that the first term on the right-hand side also must be a constant given by

$$\frac{1}{2}I_1\left(\omega_1^2 + \omega_2^2\right) = \frac{1}{2}I_1\left(\dot\phi^2\sin^2\theta + \dot\theta^2\right) = E - \frac{p_\psi^2}{2I_3} = \text{constant} \tag{11.155}$$

These results are identical to those given in equations 11.120 and 11.121 which were derived using Euler's
equations. These results illustrate that the underlying physics of the torque-free rigid rotor is more easily
extracted using Lagrangian mechanics rather than using the Euler-angle approach of Newtonian mechanics.


11.9  Example:  Precession  rate  for  torque-free  rotating  symmetric  rigid  rotor 

Table 11.2 lists the precession and spin angular velocities, in the space-fixed frame, for torque-free rotation
of three extreme symmetric-top geometries spinning with constant angular velocity $\omega$ when the motion
is slightly perturbed such that $\boldsymbol{\omega}$ is at a small angle $\alpha$ to the symmetry axis. Note that this assumes the
perpendicular axis theorem, equation 11.45, which states that for a thin lamina $I_1 + I_2 = I_3$, giving, for a
thin circular disk, $I_1 = I_2$ and thus $I_3 = 2I_1$.


Table 11.2: Precession and spin rates for torque-free axial rotation of symmetric rigid rotors

Rigid-body symmetric shape    Principal moment ratio $I_3/I_1$    Precession rate $\dot\phi$    Spin rate $\dot\psi$
Symmetric needle              0                                   0                             $\omega$
Sphere                        1                                   $\omega$                      0
Thin circular disk            2                                   $2\omega$                     $-\omega$

The precession angular velocity in the space frame ranges between 0 and $2\omega$ depending on whether the
body-fixed spin angular velocity is aligned or anti-aligned with the rotational frequency $\omega$. For an extreme
prolate spheroid, $I_3/I_1 = 0$, the body-fixed spin angular velocity $\Omega = -\omega_3$, which cancels the angular velocity
$\omega$ of the rotating frame, resulting in a zero precession angular velocity of the body-fixed $\hat{\mathbf{e}}_3$ axis around the
space-fixed frame. The spin $\Omega = 0$ in the body-fixed frame for the rigid sphere, $I_3/I_1 = 1$, and thus the precession
rate of the body-fixed $\hat{\mathbf{e}}_3$ axis of the sphere around the space-fixed frame equals $\omega$. For oblate spheroids and
thin disks, such as a frisbee, $I_3/I_1 = 2$, making the body-fixed precession angular velocity $\Omega = +\omega$, which adds
to the angular velocity $\omega$ and increases the precession rate up to $2\omega$ as seen in the space-fixed frame. This
illustrates that the spin angular velocity can add constructively or destructively with the angular velocity $\omega$.²
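The limiting values discussed in this example can be reproduced from relations quoted earlier in the chapter. The sketch below is an illustration under stated assumptions: the precession rate is obtained by combining $\dot\phi\sin\theta = \omega\sin\alpha$ with $\tan\theta = (I_1/I_3)\tan\alpha$, which gives $\dot\phi = \omega\sqrt{\sin^2\alpha + (I_3/I_1)^2\cos^2\alpha}$, and the spin rate uses $\dot\psi = -\Omega$ with $\Omega = ((I_3 - I_1)/I_1)\,\omega\cos\alpha$. The function names and the small angle are hypothetical choices.

```python
import math

def precession_rate(ratio, omega, alpha):
    """Space-frame precession rate phi_dot for moment ratio I3/I1 = ratio.

    Follows from phi_dot * sin(theta) = omega * sin(alpha) with
    tan(theta) = tan(alpha) / ratio.
    """
    return omega * math.sqrt(math.sin(alpha) ** 2 + (ratio * math.cos(alpha)) ** 2)

def spin_rate(ratio, omega, alpha):
    """Spin rate psi_dot = -Omega = (1 - I3/I1) * omega_3, with omega_3 = omega cos(alpha)."""
    return (1.0 - ratio) * omega * math.cos(alpha)

omega = 1.0        # rates below are in units of omega
alpha = 1.0e-4     # small perturbation angle in radians

shapes = [("needle", 0.0), ("sphere", 1.0), ("disk", 2.0)]
rows = {name: (precession_rate(r, omega, alpha), spin_rate(r, omega, alpha))
        for name, r in shapes}
```

In the small-angle limit the entries of `rows` approach the precession rates 0, $\omega$, $2\omega$ and spin rates $\omega$, 0, $-\omega$ for the needle, sphere and thin disk respectively.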


² In his autobiography Surely You're Joking, Mr. Feynman!, Richard Feynman wrote "I was in the [Cornell] cafeteria and some guy, fooling
around, throws a plate in the air. As the plate went up in the air I saw it wobble, and noticed that the red medallion of
Cornell on the plate going around. It was pretty obvious to me that the medallion went around faster than the wobbling. I
started to figure out the motion of the rotating plate. I discovered that when the angle is very slight, the medallion rotates
twice as fast as the wobble rate. It came out of a very complicated equation!". The quoted ratio (2 : 1) is incorrect; it should
be (1 : 2). Benjamin Chao in Physics Today of February 1989 speculated that Feynman's error in inverting the factor of
two might be "in keeping with the spirit of the author and the book, another practical joke meant for those who do physics
without experimenting". He pointed out that this story occurred on page 157 of a book of length 314 pages (1:2). Observe the
dependence of the ratio of wobble to rotation angular velocities on the tilt angle $\theta$.

11.21 Torque-free rotation of an asymmetric rigid rotor


The  Euler  equations  of  motion  for  the  case  of  torque-free  rotation  of  an  asymmetric  (triaxial)  rigid  rotor 
about  the  center  of  mass,  with  principal  moments  of  inertia  I\  I2  7^  I3,  lead  to  more  complicated  motion 
than  for  the  symmetric  rigid  rotor.3  The  general  features  of  the  motion  of  the  asymmetric  rotor  can  be 
deduced  using  the  conservation  of  angular  momentum  and  rotational  kinetic  energy. 

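Assuming that the external torques are zero, the Euler equations of motion can be written as

$$\begin{aligned} I_1\dot{\omega}_1 &= (I_2 - I_3)\,\omega_2\omega_3 \\ I_2\dot{\omega}_2 &= (I_3 - I_1)\,\omega_3\omega_1 \\ I_3\dot{\omega}_3 &= (I_1 - I_2)\,\omega_1\omega_2 \end{aligned} \tag{11.156}$$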

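Since $L_i = I_i\omega_i$ for $i = 1, 2, 3$, equation 11.156 gives

$$\begin{aligned} I_2I_3\dot{L}_1 &= (I_2 - I_3)\,L_2L_3 \\ I_1I_3\dot{L}_2 &= (I_3 - I_1)\,L_3L_1 \\ I_1I_2\dot{L}_3 &= (I_1 - I_2)\,L_1L_2 \end{aligned} \tag{11.157}$$

Multiply the first equation by $I_1L_1$, the second by $I_2L_2$, and the third by $I_3L_3$, and sum, which gives

$$I_1I_2I_3\left(L_1\dot{L}_1 + L_2\dot{L}_2 + L_3\dot{L}_3\right) = 0 \tag{11.158}$$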


The vanishing of the bracket is equivalent to $\frac{d}{dt}\left(L_1^2 + L_2^2 + L_3^2\right) = 0$, which implies that the total rotational angular momentum $L$ is a constant of motion, as expected for this torque-free system, even though the individual components $L_1, L_2, L_3$ may vary. That is

$$L_1^2 + L_2^2 + L_3^2 = L^2 \tag{11.159}$$

Note that equation 11.159 is the equation of a sphere of radius $L$.

Figure 11.6: Rotation of an asymmetric rigid rotor. The dark lines correspond to contours of constant total rotational kinetic energy $T$, which has an ellipsoidal shape, projected onto the angular momentum $L$ sphere in the body-fixed frame.

Multiplying the first equation of 11.157 by $L_1$, the second by $L_2$, and the third by $L_3$, and summing, gives

$$I_2I_3L_1\dot{L}_1 + I_1I_3L_2\dot{L}_2 + I_1I_2L_3\dot{L}_3 = 0 \tag{11.160}$$


Dividing 11.160 by $I_1I_2I_3$ gives $\frac{d}{dt}\left(\frac{L_1^2}{2I_1} + \frac{L_2^2}{2I_2} + \frac{L_3^2}{2I_3}\right) = 0$. This implies that the total rotational kinetic energy $T$, given by

$$\frac{L_1^2}{2I_1} + \frac{L_2^2}{2I_2} + \frac{L_3^2}{2I_3} = T \tag{11.161}$$

is a constant of motion, as expected when there are no external torques and zero energy dissipation. Note that 11.161 is the equation of an ellipsoid.
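The two conservation laws can be checked numerically. The following is a minimal sketch, not part of the original derivation: it integrates equations 11.156 with a hand-written fourth-order Runge-Kutta step and compares $L^2$ (11.159) and $T$ (11.161) before and after. The moments of inertia, initial angular velocity, and step size are illustrative assumptions.

```python
def euler_rhs(w, I):
    """Right-hand side of the torque-free Euler equations (11.156)."""
    w1, w2, w3 = w
    I1, I2, I3 = I
    return ((I2 - I3) * w2 * w3 / I1,
            (I3 - I1) * w3 * w1 / I2,
            (I1 - I2) * w1 * w2 / I3)


def rk4_step(w, I, dt):
    """Advance the body-fixed angular velocity by one Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(w, k1, dt / 2), I)
    k3 = euler_rhs(shifted(w, k2, dt / 2), I)
    k4 = euler_rhs(shifted(w, k3, dt), I)
    return tuple(wi + dt / 6 * (a + 2 * b + 2 * c + d)
                 for wi, a, b, c, d in zip(w, k1, k2, k3, k4))


def invariants(w, I):
    """Return (L^2, T) from equations 11.159 and 11.161, with L_i = I_i * w_i."""
    L = [Ii * wi for Ii, wi in zip(I, w)]
    L_squared = sum(Li ** 2 for Li in L)
    T = sum(Li ** 2 / (2 * Ii) for Li, Ii in zip(L, I))
    return L_squared, T


I = (1.0, 2.0, 3.0)      # assumed moments with I3 > I2 > I1
w = (0.3, 1.0, 0.2)      # assumed initial angular velocity components
L2_start, T_start = invariants(w, I)
for _ in range(20000):   # integrate to t = 20 with dt = 0.001
    w = rk4_step(w, I, 1.0e-3)
L2_end, T_end = invariants(w, I)

# Both differences should be at the level of the integration error.
print(abs(L2_end - L2_start), abs(T_end - T_start))
```

The individual components of the angular velocity change during the run, while both invariants stay fixed to within the accuracy of the integrator.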

Equations 11.159 and 11.161 both must be satisfied by the rotational motion for any value of the total angular momentum $L$ and kinetic energy $T$. Fig 11.6 shows a graphical representation of the intersection of the $L$-sphere and $T$-ellipsoid as seen in the body-fixed frame. The angular momentum vector $\mathbf{L}$ must follow the constant-energy contours given by where the $T$-ellipsoids intersect the $L$-sphere, shown for the case where $I_3 > I_2 > I_1$. Note that the precession of the angular momentum vector $\mathbf{L}$ follows a trajectory that has closed paths that circle around the principal axis with the smallest $I$, that is, $\hat{e}_1$, or the principal axis with the maximum $I$, that is, $\hat{e}_3$. However, the angular momentum vector does not have a stable minimum for precession around the intermediate principal moment of inertia axis $\hat{e}_2$. In addition to the precession, the angular momentum vector $\mathbf{L}$ executes nutation, that is, a nodding of the angle $\theta$.

For any fixed value of $L$, the kinetic energy has upper and lower bounds given by

$$\frac{L^2}{2I_3} \le T \le \frac{L^2}{2I_1} \tag{11.162}$$
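The bounds in 11.162 follow directly from the ordering of the moments. A short sketch of the reasoning, assuming $I_3 > I_2 > I_1$ as in figure 11.6, is to replace every moment in 11.161 by the smallest or the largest one and then use 11.159:

```latex
T = \frac{L_1^2}{2I_1} + \frac{L_2^2}{2I_2} + \frac{L_3^2}{2I_3}
  \;\le\; \frac{L_1^2 + L_2^2 + L_3^2}{2I_1} = \frac{L^2}{2I_1},
\qquad
T \;\ge\; \frac{L_1^2 + L_2^2 + L_3^2}{2I_3} = \frac{L^2}{2I_3}.
```

Equality in the upper bound requires $L_2 = L_3 = 0$, and equality in the lower bound requires $L_1 = L_2 = 0$.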


3 Similar  discussions  of  the  freely-rotating  asymmetric  top  are  given  by  Landau  and  Lifshitz  [La60]  and  by  Gregory  [Gr06]. 
Thus, for a given value of $L$, when $T = T_{min} = \frac{L^2}{2I_3}$, the orientation of $\mathbf{L}$ in the body-fixed frame is either $(0, 0, +L)$ or $(0, 0, -L)$, that is, aligned with the $\hat{e}_3$ axis along which the principal moment of inertia is largest. For slightly higher kinetic energy the trajectory of $\mathbf{L}$ follows closed paths precessing around $\hat{e}_3$. When the kinetic energy $T = \frac{L^2}{2I_2}$, the angular momentum vector $\mathbf{L}$ follows either of the two thin-line trajectories, each of which is a separatrix. These do not have closed orbits around $\hat{e}_2$, and they separate the closed solutions around either $\hat{e}_3$ or $\hat{e}_1$. For higher kinetic energy the precessing angular momentum vector follows closed trajectories around $\hat{e}_1$ and becomes fully aligned with $\hat{e}_1$ at the upper-bound kinetic energy.

Note that for the special case when $I_3 > I_2 = I_1$, the asymmetric rigid rotor reduces to the symmetric rigid rotor, for which Euler's equations were solved exactly in chapter 11.19. For the symmetric rigid rotor the $T$-ellipsoid becomes a spheroid aligned with the symmetry axis, and thus the intersections with the $L$-sphere lead to circular paths around the $\hat{e}_3$ body-fixed principal axis, while the separatrix circles the equator corresponding to the $\hat{e}_3$ axis, separating clockwise and anticlockwise precession about $L_3$. This discussion shows that energy, plus angular momentum conservation, provide the general features of the solution for the torque-free symmetric top that are in agreement with those derived using Euler's equations of motion.


11.22  Stability  of  torque-free  rotation  of  an  asymmetric  body 

It is of interest to extend the prior discussion to address the stability of an asymmetric rigid rotor undergoing force-free rotation close to a principal axis, that is, when subject to small perturbations. Consider the case of a general asymmetric rigid body with $I_3 > I_2 > I_1$. Let the system start with rotation about the $\hat{e}_1$ axis, that is, the principal axis associated with the moment of inertia $I_1$. Then

$$\boldsymbol{\omega} = \omega_1\hat{e}_1 \tag{11.163}$$

Consider that a small perturbation is applied, causing the angular velocity vector to be

$$\boldsymbol{\omega} = \omega_1\hat{e}_1 + \lambda\hat{e}_2 + \mu\hat{e}_3 \tag{11.164}$$

where $\lambda, \mu$ are very small. The Euler equations (11.156) become

$$\begin{aligned} (I_2 - I_3)\,\lambda\mu - I_1\dot{\omega}_1 &= 0 \\ (I_3 - I_1)\,\mu\omega_1 - I_2\dot{\lambda} &= 0 \\ (I_1 - I_2)\,\omega_1\lambda - I_3\dot{\mu} &= 0 \end{aligned}$$

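Assuming that the product $\lambda\mu$ in the first equation is negligible, then $\dot{\omega}_1 = 0$, that is, $\omega_1$ is constant.

The other two equations can be solved to give

$$\dot{\lambda} = \left(\frac{I_3 - I_1}{I_2}\,\omega_1\right)\mu \tag{11.165}$$

$$\dot{\mu} = \left(\frac{I_1 - I_2}{I_3}\,\omega_1\right)\lambda \tag{11.166}$$

Taking the time derivative of equation 11.165 gives

$$\ddot{\lambda} = \left(\frac{I_3 - I_1}{I_2}\,\omega_1\right)\dot{\mu} \tag{11.167}$$

and substituting for $\dot{\mu}$ gives

$$\ddot{\lambda} + \left(\frac{(I_1 - I_3)(I_1 - I_2)}{I_2I_3}\,\omega_1^2\right)\lambda = 0 \tag{11.168}$$

The solution of this equation is

$$\lambda(t) = Ae^{i\Omega_{1\lambda}t} + Be^{-i\Omega_{1\lambda}t} \tag{11.169}$$

where

$$\Omega_{1\lambda} = \omega_1\sqrt{\frac{(I_1 - I_3)(I_1 - I_2)}{I_2I_3}} \tag{11.170}$$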



Note that since it was assumed that $I_3 > I_2 > I_1$, $\Omega_{1\lambda}$ is real. The solution for $\lambda(t)$ therefore represents a stable oscillatory motion with precession frequency $\Omega_{1\lambda}$. The identical result is obtained for $\mu$, that is, $\Omega_{1\mu} = \Omega_{1\lambda} \equiv \Omega_1$. Thus the motion corresponds to a stable minimum about the $\hat{e}_1$ axis, with oscillations about the $\lambda = \mu = 0$ minimum at the angular frequency

$$\Omega_1 = \omega_1\sqrt{\frac{(I_1 - I_3)(I_1 - I_2)}{I_2I_3}} \tag{11.171}$$


Permuting the indices shows that perturbations applied to rotation about either the 2 or 3 axes give the precession frequencies

$$\Omega_2 = \omega_2\sqrt{\frac{(I_2 - I_1)(I_2 - I_3)}{I_1I_3}} \tag{11.172}$$

$$\Omega_3 = \omega_3\sqrt{\frac{(I_3 - I_2)(I_3 - I_1)}{I_1I_2}} \tag{11.173}$$


Since $I_3 > I_2 > I_1$, $\Omega_1$ and $\Omega_3$ are real while $\Omega_2$ is imaginary. Thus, whereas rotation about either the $I_3$ or the $I_1$ axis is stable, the imaginary solution about $\hat{e}_2$ corresponds to a perturbation increasing with time. Thus, only rotation about the axes with the largest or smallest moments of inertia is stable. Moreover, for the symmetric rigid rotor, with $I_1 = I_2 \neq I_3$, stability exists only about the symmetry axis $\hat{e}_3$, independent of whether the body is prolate or oblate. This result was implied from the use of energy and angular momentum conservation in chapter 11.21. Friction was not included in the above discussion. In the presence of dissipative forces, such as friction or drag, only rotation about the principal axis corresponding to the maximum moment of inertia is stable.
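The stability criterion can be evaluated directly from equations 11.171 to 11.173. The sketch below is illustrative only: the function names and the sample moments of inertia are assumptions, and `cmath.sqrt` is used so that a negative argument yields an imaginary frequency rather than an error. The degenerate symmetric case, where a frequency vanishes, is not handled.

```python
import cmath


def precession_frequency(I, axis, omega=1.0):
    """Perturbation frequency (equations 11.171-11.173) for rotation
    about principal axis 1, 2, or 3 with angular velocity omega."""
    k = axis - 1
    a, b = [n for n in range(3) if n != k]
    ratio = (I[k] - I[a]) * (I[k] - I[b]) / (I[a] * I[b])
    return omega * cmath.sqrt(ratio)


def is_stable(I, axis):
    """Stable when the frequency is real and non-zero, so that the
    perturbation oscillates instead of growing."""
    Omega = precession_frequency(I, axis)
    return Omega.imag == 0.0 and Omega.real > 0.0


I = (1.0, 2.0, 3.0)   # assumed moments with I3 > I2 > I1
for axis in (1, 2, 3):
    Omega = precession_frequency(I, axis)
    if is_stable(I, axis):
        print(f"axis {axis}: stable, Omega/omega = {Omega.real:.4f}")
    else:
        # An imaginary frequency i*kappa turns exp(i*Omega*t) into exp(+/- kappa*t).
        print(f"axis {axis}: unstable, e-folding time = {1 / abs(Omega.imag):.4f} / omega")
```

For these sample moments only the intermediate axis 2 is reported as unstable, in line with the discussion above.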

Stability  of  rigid-body  rotation  has  broad  applications  to  rotation  of  satellites,  molecules  and  nuclei. 
The  first  U.S.  satellite,  Explorer  1,  was  launched  in  1958  with  the  rotation  axis  aligned  with  the  cylindrical 
axis  which  was  the  minimum  principal  moment  of  inertia.  After  a few  hours  the  satellite  started  tumbling 
with  increasing  amplitude  due  to  a flexible  antenna  dissipating  and  transferring  energy  to  the  perpendicular 
axis which had the largest moment of inertia. Torque-free motion of a deformed rigid body is a ubiquitous phenomenon in many branches of science, engineering, and sports, as illustrated by the following examples.


11.10  Example:  Tennis  racquet  dynamics 


A tennis racquet is an asymmetric body that exhibits the above rotational behavior. Assume that the head of a tennis racquet is a uniform thin circular disk of radius $R$ and mass $M$, which is attached to a cylindrical handle of radius $r = \frac{R}{10}$, length $2R$, and mass $M$, as shown in the figure. The principal moments of inertia about the three axes through the center of mass can be calculated by addition of the moments for the circular disk and the cylindrical handle, using both the parallel-axis and the perpendicular-axis theorems.

Axis    Head                                          Handle                   Racquet
1       $\frac{1}{4}MR^2 + MR^2 = \frac{5}{4}MR^2$    $\frac{4}{3}MR^2$        $\frac{31}{12}MR^2$
2       $\frac{1}{4}MR^2 + 0 = \frac{1}{4}MR^2$       $\frac{1}{200}MR^2$      $\frac{51}{200}MR^2$
3       $\frac{1}{2}MR^2 + MR^2 = \frac{3}{2}MR^2$    $\frac{4}{3}MR^2$        $\frac{17}{6}MR^2$

Note that $I_{11} : I_{22} : I_{33} = 2.5833 : 0.2550 : 2.8333$. Inserting these principal moments of inertia into equations 11.171-11.173 gives the following precession frequencies

$$\Omega_1 = i\,0.8976\,\omega_1 \qquad \Omega_2 = 0.9056\,\omega_2 \qquad \Omega_3 = 0.9892\,\omega_3$$

Principal rotation axes for the center of mass of a tennis racket. The 1 and 2 axes are in the plane of the racket head and the 3 axis is perpendicular to the plane of the racket head.

The imaginary precession frequency $\Omega_1$ about the 1 axis implies unstable rotation leading to tumbling, whereas the minimum moment $I_{22}$ and maximum moment $I_{33}$ imply stable rotation about the 2 and 3 axes. This rotational behavior is easily demonstrated by throwing a tennis racquet and is called the tennis racquet theorem. The center of percussion, example 2.14, also is an important inertial property of a tennis racquet.
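The numbers quoted in this example can be reproduced with a few lines of arithmetic. The sketch below makes several assumptions that are not spelled out in the text: the handle is treated as a thin rod for the two transverse axes, its radius is taken as $R/10$ (the value implied by the quoted $I_{22}$ ratio), and the centers of the head and handle each lie a distance $R$ from the common center of mass.

```python
import cmath

M, R = 1.0, 1.0   # work in units where M R^2 = 1
r = R / 10        # assumed handle radius
d = R             # assumed distance of each part's center from the common center of mass

# Head: thin disk. In-plane axes have M R^2 / 4, the normal axis M R^2 / 2.
head = {1: M * R**2 / 4 + M * d**2,    # shifted by the parallel-axis theorem
        2: M * R**2 / 4,               # axis along the handle, no shift
        3: M * R**2 / 2 + M * d**2}

# Handle: thin rod of length 2R about its center, then shifted; cylinder about its own axis.
handle = {1: M * (2 * R)**2 / 12 + M * d**2,
          2: M * r**2 / 2,
          3: M * (2 * R)**2 / 12 + M * d**2}

I = {k: head[k] + handle[k] for k in (1, 2, 3)}


def omega_ratio(k):
    """Omega_k / omega_k from equations 11.171-11.173."""
    a, b = [n for n in (1, 2, 3) if n != k]
    return cmath.sqrt((I[k] - I[a]) * (I[k] - I[b]) / (I[a] * I[b]))


for k in (1, 2, 3):
    print(k, round(I[k], 4), omega_ratio(k))
```

With these assumptions the moments come out as 31/12, 51/200, and 17/6 in units of $MR^2$, and only the 1 axis, which carries the intermediate moment, has an imaginary frequency.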
11.11 Example: Rotation of asymmetrically-deformed nuclei

Some nuclei and molecules have average shapes that have significant asymmetric deformation, leading to interesting quantal analogs of the rotational properties of an asymmetrically-deformed rigid body. The major difference between a quantal and a classical rotor is that the energies and angular momentum are quantized, rather than being continuously variable quantities. Otherwise, the quantal rotors exhibit general features similar to the classical analog. Studies [Cli86] of the rotational behavior of asymmetrically-deformed nuclei exploit three aspects of classical mechanics, namely classical Coulomb trajectories, rotational invariants, and the properties of ellipsoidal rigid bodies.

Ellipsoidal deformation can be specified by the dimensions along each of the three principal axes. Bohr and Mottelson parameterized the ellipsoidal deformation in terms of three parameters: $R_0$, which is the radius of the equivalent sphere; $\beta$, which is a measure of the magnitude of the ellipsoidal deformation from the sphere; and $\gamma$, which specifies the deviation of the shape from axial symmetry. The ellipsoidal intrinsic shape can be expressed in terms of the deviation from the equivalent sphere by the equation

$$\delta R(\theta,\phi) = R(\theta,\phi) - R_0 = R_0\sum_{\mu=-2}^{+2}\alpha^{*}_{2\mu}\,Y_{2\mu}(\theta,\phi) \tag{a}$$

where $Y_{\lambda\mu}(\theta,\phi)$ is a Laplace spherical harmonic defined as

$$Y_{\lambda\mu}(\theta,\phi) = \sqrt{\frac{(2\lambda+1)}{4\pi}\,\frac{(\lambda-\mu)!}{(\lambda+\mu)!}}\;P_{\lambda\mu}(\cos\theta)\,e^{i\mu\phi}$$

and $P_{\lambda\mu}(\cos\theta)$ is an associated Legendre function of $\cos\theta$. Spherical harmonics are the angular portion of a set of solutions to Laplace's equation. Represented in a system of spherical coordinates, Laplace's spherical harmonics $Y_{\lambda\mu}(\theta,\phi)$ are a specific set of spherical harmonics that form an orthogonal system. Spherical harmonics are important in many theoretical and practical applications.

In the principal axis frame of the body, there are three non-zero quadrupole deformation parameters, which can be written in terms of the deformation parameters $(\beta,\gamma)$, where $\alpha_{20} = \beta\cos\gamma$, $\alpha_{21} = \alpha_{2-1} = 0$, and $\alpha_{22} = \alpha_{2-2} = \frac{1}{\sqrt{2}}\beta\sin\gamma$. Using these in equation (a) gives the three semi-axis dimensions in the principal axis frame (primed frame),

$$\delta R_k = \sqrt{\frac{5}{4\pi}}\,R_0\,\beta\cos\left(\gamma - \frac{2\pi k}{3}\right) \tag{b}$$

Note that for $\gamma = 0$, $\delta R_1 = \delta R_2 = -\frac{1}{2}\sqrt{\frac{5}{4\pi}}R_0\beta$ while $\delta R_3 = +\sqrt{\frac{5}{4\pi}}R_0\beta$, that is, the body has prolate deformation with the symmetry axis along the 3 axis. The same prolate shape is obtained for $\gamma = \frac{2\pi}{3}$ and $\gamma = \frac{4\pi}{3}$, with the prolate symmetry axes along the 1 and 2 axes respectively. For $\gamma = \frac{\pi}{3}$, $\delta R_1 = \delta R_3 = +\frac{1}{2}\sqrt{\frac{5}{4\pi}}R_0\beta$ while $\delta R_2 = -\sqrt{\frac{5}{4\pi}}R_0\beta$, that is, the body has oblate deformation with the symmetry axis along the 2 axis. The same oblate shape is obtained for $\gamma = \pi$ and $\gamma = \frac{5\pi}{3}$, with the oblate symmetry axes along the 3 and 1 axes respectively. For other values of $\gamma$ the shape is ellipsoidal.
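The dependence of the shape on $\gamma$ can be tabulated directly from equation (b). The sketch below is illustrative: the function names, the tolerance, and the sample value of $\beta$ are assumptions, and $\beta > 0$ is assumed throughout.

```python
import math


def semi_axis_shifts(beta, gamma, R0=1.0):
    """delta R_k for k = 1, 2, 3 from equation (b)."""
    scale = math.sqrt(5.0 / (4.0 * math.pi)) * R0 * beta
    return [scale * math.cos(gamma - 2.0 * math.pi * k / 3.0) for k in (1, 2, 3)]


def classify(beta, gamma, tol=1e-9):
    """Return (shape, symmetry axis). Two equal shifts signal axial symmetry:
    prolate if the third axis is longer, oblate if it is shorter."""
    d = semi_axis_shifts(beta, gamma)
    for k in range(3):
        others = [d[n] for n in range(3) if n != k]
        if abs(others[0] - others[1]) < tol:
            kind = "prolate" if d[k] > others[0] else "oblate"
            return kind, k + 1
    return "triaxial", None


beta = 0.3   # assumed deformation magnitude
for label, gamma in (("0", 0.0), ("pi/3", math.pi / 3), ("2pi/3", 2 * math.pi / 3),
                     ("pi", math.pi), ("pi/6", math.pi / 6)):
    print(f"gamma = {label}: {classify(beta, gamma)}")
```

The axially symmetric cases reproduce the assignments listed above, while an intermediate value such as $\gamma = \frac{\pi}{6}$ gives three different semi-axes.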

For the asymmetric deformed rigid body, the rotational Hamiltonian can be expressed in the form [Dav58]

$$H = \sum_{k=1}^{3}\frac{|R_k|^2}{4B\beta^2\sin^2\left(\gamma' - \frac{2\pi k}{3}\right)}$$

where the rotational angular momentum is $R$. The principal moments of inertia are related by the triaxiality parameter $\gamma'$, which was assumed to be identical to the shape parameter $\gamma$. For axial symmetry the moment of inertia about the symmetry axis is taken to be zero for a quantal system, since rotation of the potential well about the symmetry axis corresponds to no change in the potential well, or corresponding rotation of the bound nucleons. That is, the nucleus is not a rigid body; the nucleons only rotate to the extent that the ellipsoidal potential well is cranked around such that the nucleons must follow the rotation of the potential well. In addition, vibrational modes coexist about the average asymmetric deformation, and octupole deformation often coexists with the above quadrupole deformed modes.
11.23 Symmetric rigid rotor subject to torque about a fixed point


The motion of a symmetric top rotating in a gravitational field, with one point at a fixed location, is encountered frequently in rotational motion. Examples are the gyroscope and a child's spinning top. Rotation of a rigid rotor subject to torque about a fixed point is a case where it is necessary to take the inertia tensor with respect to the fixed point in the body, and not at the center of mass.

Consider the geometry, shown in figure 11.7, where the symmetric top of mass $M$ is spinning about a fixed tip that is displaced by a distance $h$ from the center of mass. The tip of the top is assumed to be at the origin of both the space-fixed frame $(x, y, z)$ and the body-fixed frame $(1, 2, 3)$. Assume that the translational velocity is zero and let the principal moments about the fixed point of the symmetric top be $I_1 = I_2 \neq I_3$.

The Lagrange equations of motion can be derived assuming that the kinetic energy equals the rotational kinetic energy, that is, it is assumed that the translational kinetic energy $T_{trans} = 0$. Then the kinetic energy of an inertially-symmetric rigid rotor can be derived, as for the torque-free symmetric top given in equation 11.145, to be

$$T = \frac{1}{2}\sum_i I_i\omega_i^2 = \frac{1}{2}I_1\left(\omega_1^2 + \omega_2^2\right) + \frac{1}{2}I_3\omega_3^2 \tag{11.174}$$

$$\phantom{T} = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 \tag{11.175}$$

Since the potential energy is $U = Mgh\cos\theta$, the Lagrangian equals

$$L = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 - Mgh\cos\theta \tag{11.176}$$

Figure 11.7: Symmetric top spinning about one fixed point.

The angular momentum about the space-fixed $z$ axis, $p_\phi$, is conjugate to $\phi$. From Lagrange's equations

$$\dot{p}_\phi = \frac{\partial L}{\partial\phi} = 0 \tag{11.177}$$

that is, $p_\phi$ is a constant of motion given by the generalized momentum

$$p_\phi = \frac{\partial L}{\partial\dot{\phi}} = \left(I_1\sin^2\theta + I_3\cos^2\theta\right)\dot{\phi} + I_3\dot{\psi}\cos\theta = S_z = \text{constant} \tag{11.178}$$

where $S_z$ is the angular momentum projection along the space-fixed $z$ axis.

Similarly, the angular momentum about the body-fixed 3 axis, $p_\psi$, is conjugate to $\psi$. From Lagrange's equations,

$$\dot{p}_\psi = \frac{\partial L}{\partial\psi} = 0 \tag{11.179}$$

that is, $p_\psi$ is a constant of motion given by the generalized momentum

$$p_\psi = \frac{\partial L}{\partial\dot{\psi}} = I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right) = B_3 = \text{constant} \tag{11.180}$$


where $B_3$ is the angular momentum projection along the body-fixed 3 axis. The above two relations can be solved to give the precessional angular velocity $\dot{\phi}$ about the space-fixed $z$ axis

$$\dot{\phi} = \frac{p_\phi - p_\psi\cos\theta}{I_1\sin^2\theta} = \frac{S_z - B_3\cos\theta}{I_1\sin^2\theta} \tag{11.181}$$

and the spin angular velocity $\dot{\psi}$ about the body-fixed 3 axis

$$\dot{\psi} = \frac{p_\psi}{I_3} - \frac{\left(p_\phi - p_\psi\cos\theta\right)\cos\theta}{I_1\sin^2\theta} = \frac{B_3}{I_3} - \frac{\left(S_z - B_3\cos\theta\right)\cos\theta}{I_1\sin^2\theta} \tag{11.182}$$

Since $p_\phi$ and $p_\psi$ are constants of motion, i.e. $S_z, B_3$, these rotational angular velocities depend on only $I_1$, $I_3$, and $\theta$.
There is one further constant of motion available if no frictional forces act on the system, that is, energy conservation. This implies that the total energy

$$E = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 + Mgh\cos\theta \tag{11.183}$$

will be a constant of motion. But the middle term on the right-hand side also is a constant of motion

$$\frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 = \frac{p_\psi^2}{2I_3} = \frac{B_3^2}{2I_3} = \text{constant} \tag{11.184}$$

Thus energy conservation can be rewritten by defining an energy $E'$ where

$$E' \equiv E - \frac{p_\psi^2}{2I_3} = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + Mgh\cos\theta = \text{constant} \tag{11.185}$$

This can be written as

$$E' = \frac{1}{2}I_1\dot{\theta}^2 + \frac{\left(p_\phi - p_\psi\cos\theta\right)^2}{2I_1\sin^2\theta} + Mgh\cos\theta \tag{11.186}$$

which can be expressed as

$$E' = \frac{1}{2}I_1\dot{\theta}^2 + V(\theta) \tag{11.187}$$

where $V(\theta)$ is an effective potential

$$V(\theta) \equiv \frac{\left(p_\phi - p_\psi\cos\theta\right)^2}{2I_1\sin^2\theta} + Mgh\cos\theta = \frac{\left(S_z - B_3\cos\theta\right)^2}{2I_1\sin^2\theta} + Mgh\cos\theta \tag{11.188}$$

Figure 11.8: Effective potential diagram for a spinning symmetric top as a function of $\theta$.

The effective potential $V(\theta)$ is shown in figure 11.8. It is clear that the motion of a symmetric top with effective energy $E'$ is confined to angles $\theta_1 < \theta < \theta_2$.
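The limiting angles $\theta_1$ and $\theta_2$ are the roots of $V(\theta) = E'$ on either side of the minimum of the effective potential. The sketch below locates them by bisection. All numerical values are dimensionless, illustrative assumptions rather than values from the text, and the helper names are hypothetical.

```python
import math


def V(theta, p_phi, p_psi, I1, Mgh):
    """Effective potential of equation 11.188."""
    u = p_phi - p_psi * math.cos(theta)
    return u**2 / (2.0 * I1 * math.sin(theta)**2) + Mgh * math.cos(theta)


def bisect(f, a, b, steps=100):
    """Root of f between a and b, assuming f(a) and f(b) have opposite signs."""
    fa = f(a)
    for _ in range(steps):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if (fm < 0.0) == (fa < 0.0):
            a, fa = mid, fm
        else:
            b = mid
    return 0.5 * (a + b)


# Assumed sample values.
p_phi, p_psi, I1, Mgh = 2.0, 3.0, 1.0, 1.0
E_prime = 1.0


def potential(theta):
    return V(theta, p_phi, p_psi, I1, Mgh)


def excess(theta):
    return potential(theta) - E_prime


grid = [k * math.pi / 2000 for k in range(1, 2000)]
theta0 = min(grid, key=potential)           # coarse location of the minimum of V
theta1 = bisect(excess, grid[0], theta0)    # turning point below theta0
theta2 = bisect(excess, theta0, grid[-1])   # turning point above theta0
print(theta1, theta0, theta2)
```

Because $V(\theta)$ diverges as $\theta \to 0$ and $\theta \to \pi$ when $p_\phi \neq \pm p_\psi$, a bracketing interval exists on each side of the minimum whenever $E' > V_{min}$.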

Note that the above result also is obtained if the Routhian is used, rather than the Lagrangian, as mentioned in chapter 8.7 and defined by equation (8.65). That is, the Routhian can be written as

$$R(\theta,\dot{\theta},p_\phi,p_\psi)_{cyclic} = \dot{\phi}p_\phi + \dot{\psi}p_\psi - L = H(\phi,p_\phi,\psi,p_\psi)_{cyclic} - L(\theta,\dot{\theta})_{noncyclic}$$

$$\phantom{R} = -\frac{1}{2}I_1\dot{\theta}^2 + \frac{\left(p_\phi - p_\psi\cos\theta\right)^2}{2I_1\sin^2\theta} + \frac{p_\psi^2}{2I_3} + Mgh\cos\theta \tag{11.189}$$

The Routhian $R(\theta,\dot{\theta},p_\phi,p_\psi)_{cyclic}$ acts like a Hamiltonian for the $(\phi,p_\phi)$ and $(\psi,p_\psi)$ variables, which are constants of motion and thus are ignorable variables. The Routhian acts as the negative Lagrangian for the remaining variable $\theta$, with rotational kinetic energy $\frac{1}{2}I_1\dot{\theta}^2$ and effective potential energy

$$V_{eff} = \frac{\left(p_\phi - p_\psi\cos\theta\right)^2}{2I_1\sin^2\theta} + \frac{p_\psi^2}{2I_3} + Mgh\cos\theta = V(\theta) + \frac{p_\psi^2}{2I_3}$$

The equation of motion describing the system in the rotating frame is given by one Lagrange equation

$$\frac{d}{dt}\left(\frac{\partial R_{cyclic}}{\partial\dot{\theta}}\right) - \frac{\partial R_{cyclic}}{\partial\theta} = 0$$

The negative sign of the Routhian cancels out when used in the Lagrange equation. Thus, in the rotating frame of reference, the system is reduced to a single degree of freedom, the nutation angle $\theta$, with effective energy $E'$ given by equations 11.186-11.188.
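For reference, the single Lagrange equation for $\theta$ can be written out explicitly. The following is a sketch of that step, obtained by differentiating the effective potential of equation 11.188 with $p_\phi$ and $p_\psi$ held constant:

```latex
\frac{d}{dt}\left(\frac{\partial R_{cyclic}}{\partial\dot{\theta}}\right)
  - \frac{\partial R_{cyclic}}{\partial\theta}
  = -I_1\ddot{\theta} - \frac{\partial V_{eff}}{\partial\theta} = 0
\quad\Longrightarrow\quad
I_1\ddot{\theta}
  = -\frac{\left(p_\phi - p_\psi\cos\theta\right)p_\psi}{I_1\sin\theta}
    + \frac{\left(p_\phi - p_\psi\cos\theta\right)^2\cos\theta}{I_1\sin^3\theta}
    + Mgh\sin\theta
```

The constant term $p_\psi^2/2I_3$ in $V_{eff}$ does not contribute, so the same equation of motion follows from $V(\theta)$ alone.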
Figure 11.9: Nutational motion of the body-fixed symmetry axis projected onto the space-fixed unit sphere. The three cases are (a) $\dot{\phi}$ never vanishes, (b) $\dot{\phi} = 0$ at $\theta = \theta_2$, (c) $\dot{\phi}$ changes sign between $\theta_1$ and $\theta_2$.

The motion of the symmetric top is simplest at the minimum value of the effective potential curve, where $E' = V_{min}$, at which the nutation angle $\theta$ is restricted to a single value $\theta = \theta_0$. The motion is a steady precession at a fixed angle of inclination, that is, the "sleeping top". Solving for $\left(\frac{\partial V}{\partial\theta}\right)_{\theta=\theta_0} = 0$ gives that

$$p_\phi - p_\psi\cos\theta_0 = \frac{p_\psi\sin^2\theta_0}{2\cos\theta_0}\left[1 \pm \sqrt{1 - \frac{4MghI_1\cos\theta_0}{p_\psi^2}}\,\right] \tag{11.190}$$

If $\theta_0 < \frac{\pi}{2}$, then to ensure that the solution is real requires a minimum value of the angular momentum on the body-fixed axis of $p_\psi^2 \ge 4MghI_1\cos\theta_0$. If $\theta_0 > \frac{\pi}{2}$, then there is no minimum angular momentum projection on the body-fixed axis. There are two possible solutions to the quadratic relation, corresponding to either a slow or fast precessional frequency. Usually the slow precession is observed.
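Combining equation 11.190 with equation 11.181 gives the two steady precession rates directly, since $p_\phi - p_\psi\cos\theta_0 = I_1\sin^2\theta_0\,\dot{\phi}$. The sketch below evaluates them for $\theta_0 < \frac{\pi}{2}$. The function name and the sample numbers are assumptions chosen for illustration.

```python
import math


def steady_precession_rates(p_psi, I1, Mgh, theta0):
    """Slow and fast steady precession rates for 0 < theta0 < pi/2,
    from equation 11.190 divided by I1 * sin(theta0)**2."""
    c = math.cos(theta0)
    disc = 1.0 - 4.0 * Mgh * I1 * c / p_psi**2
    if disc < 0.0:
        raise ValueError("p_psi is below the minimum needed for steady precession")
    root = math.sqrt(disc)
    base = p_psi / (2.0 * I1 * c)
    return base * (1.0 - root), base * (1.0 + root)


# Assumed sample values in dimensionless units.
slow, fast = steady_precession_rates(p_psi=10.0, I1=1.0, Mgh=1.0, theta0=math.pi / 3)
print(slow, fast)
```

For a rapidly spinning top the slow root approaches $Mgh/p_\psi$, which is the familiar gyroscopic precession rate, while the fast root is nearly independent of gravity.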

For the general case, where $E' > V_{min}$, the nutation angle $\theta$ between the space-fixed $z$ axis and the body-fixed 3 axis varies in the range $\theta_1 < \theta < \theta_2$. This axis exhibits a nodding variation which is called nutation. Figure 11.9 shows the projection of the body-fixed symmetry axis on the unit sphere in the space-fixed frame. Note that the observed nutation behavior depends on the relative sizes of $p_\phi$ and $p_\psi\cos\theta$. For certain values, the precession $\dot{\phi}$ changes sign between the two limiting values of $\theta$, producing a looping motion as shown in figure 11.9c. Another condition is where the precession is zero at $\theta_2$, producing a cusp at $\theta_2$ as illustrated in figure 11.9b. This behavior can be demonstrated using the gyroscope or the symmetric top.


11.12  Example:  The  Spinning  "Jack" 

The game "Jacks" is played using metal jacks, each of which comprises six equal masses $m$ at the opposite ends of orthogonal arms of length $l$. Consider one jack spinning around the body-fixed 3 axis with the lower mass at a fixed point on the ground, and with a steady precession around the space-fixed vertical axis $z$ with angle $\theta$ as shown. Assume that the body-fixed axes align with the arms of the jack.

Jack comprises six bodies of mass $m$ at each end of orthogonal arms of length $l$.

The principal moments of inertia about one mass are given by the parallel axis theorem to be $I_2 = I_1 = 4ml^2 + 6ml^2 = 10ml^2$ and $I_3 = 4ml^2$.

In the rotating body-fixed frame the torque due to gravity has components

$$\mathbf{N} = \begin{pmatrix} 6mgl\sin\theta\sin\psi \\ 6mgl\sin\theta\cos\psi \\ 0 \end{pmatrix}$$

and the components of the angular velocity are

$$\boldsymbol{\omega} = \begin{pmatrix} \dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi \\ \dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi \\ \dot{\phi}\cos\theta + \dot{\psi} \end{pmatrix}$$

Using Euler's equations (11.103) for the above components of $\mathbf{N}$ and $\boldsymbol{\omega}$ in the body-fixed frame gives

$$10\dot{\omega}_1 - 6\omega_2\omega_3 = \frac{6g}{l}\sin\theta\sin\psi \tag{a}$$

$$10\dot{\omega}_2 + 6\omega_1\omega_3 = \frac{6g}{l}\sin\theta\cos\psi \tag{b}$$

$$4\dot{\omega}_3 = 0 \tag{c}$$

Equation (c) relates the spin about the 3 axis, the precession, and the angle to the vertical $\theta$, that is

$$\omega_3 = \dot{\phi}\cos\theta + \dot{\psi} = \Omega\cos\theta + s = \text{constant}$$

where $\dot{\psi} = s$ is the spin and $\dot{\phi} = \Omega$ is the precession angular velocity.

If the spin axis is nearly vertical, $\theta \approx 0$ and thus $\sin\theta \approx \theta$ and $\cos\theta \approx 1$. Multiplying equation (a) $\times \sin\psi$ + (b) $\times \cos\psi$ and using the equations for the components of $\boldsymbol{\omega}$ gives

$$5\ddot{\theta} + \left(2\Omega s - 3\Omega^2 - \frac{3g}{l}\right)\theta = 0$$

The bracket must be positive to have stable sinusoidal oscillations. That is, the spin angular velocity $s$ required for the jack to spin about a stable vertical axis is given by

$$s > \frac{3\Omega}{2} + \frac{3g}{2\Omega l}$$

This illustrates the conditions required for stable rotation of any axially-symmetric top.
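The step from the positive bracket to the bound on the spin is a one-line rearrangement. As a sketch, assuming $\Omega > 0$ so that dividing by $2\Omega$ preserves the inequality:

```latex
2\Omega s - 3\Omega^2 - \frac{3g}{l} > 0
\quad\Longrightarrow\quad
s > \frac{3\Omega}{2} + \frac{3g}{2\Omega l},
\qquad
\min_{\Omega > 0}\left(\frac{3\Omega}{2} + \frac{3g}{2\Omega l}\right) = 3\sqrt{\frac{g}{l}}
\quad\text{at}\quad \Omega = \sqrt{\frac{g}{l}}
```

Under this reading, no precession rate allows a stable vertical jack unless the spin exceeds $3\sqrt{g/l}$.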

11.13  Example:  The  Tippe  Top 

The Tippe Top comprises a section of a sphere, to which a short cylindrical rod is mounted on the planar section, as illustrated. When the Tippe Top is spun on a horizontal surface this top exhibits the perverse behavior of transitioning from rotation with the spherical head resting on the horizontal surface, to flipping over such that it rotates resting on its elongated cylindrical rod. The orientation of angular momentum remains roughly vertical, as expected from conservation of angular momentum. This implies that the rotation with respect to the body-fixed axes must invert as the top inverts. The center of mass is raised when the top inverts; the additional potential energy is provided by a reduction in the rotational kinetic energy.

The Tippe Top behavior was first discovered in the 1890's, but adequate solutions of the equations of motion have only been developed since the 1950's. Since the top precesses around the vertical axis, the point of contact is not on the symmetry axis of the top. Sliding friction between the surface of the spinning top and the horizontal surface provides a torque that causes the precession of the top to increase and eventually flip up onto the cylindrical peg. The Tippe Top is typical of many phenomena in physics where the underlying physics principle can be recognized but a detailed and rigorous solution can be complicated.

The system has five degrees of freedom: $x, y$, which specify the location on the horizontal plane, plus the three Euler angles $(\phi, \theta, \psi)$. The paper by Cohen [Coh77] explains the motion in terms of Euler angles using the laboratory to body-fixed transformation relation. It shows that friction plays a pivotal role in the motion, contrary to some earlier claims. Ciocci and Langerock [Cio07] used the Routhian $R_{cyclic}$ to reduce the number of degrees of freedom from 5 to 2, namely $\theta$, which is the tilt angle, and $\phi'$, which is the orientation of the tilt. This Routhian $R_{cyclic}$ is a Lagrangian in two dimensions that was used to derive the equations of motion via the Lagrange-Euler equations

$$\frac{d}{dt}\left(\frac{\partial R_{cyclic}}{\partial\dot{\theta}}\right) - \frac{\partial R_{cyclic}}{\partial\theta} = Q_\theta$$

$$\frac{d}{dt}\left(\frac{\partial R_{cyclic}}{\partial\dot{\phi}'}\right) - \frac{\partial R_{cyclic}}{\partial\phi'} = Q_{\phi'}$$

where $Q_\theta$ and $Q_{\phi'}$ are generalized torques about the two angles that take into account the sliding frictional forces. This sophisticated Routhian reduction approach provides an exhaustive and refined solution for the Tippe Top and confirms that sliding friction plays a key role in the unusual behavior of the Tippe Top.

The geometry of the Tippe Top of radius $R$ spinning on a horizontal surface with slipping friction acting between the top and the horizontal plane. The center of mass is a distance $a$ from the center of the spherical section along the axis of symmetry of the top.

11.24  The  rolling  wheel 

As discussed in chapter 5.7, the rolling wheel is a non-holonomic system that is simple in principle, but in practice the solution can be complicated, as was illustrated by the Tippe Top. Chapter 11.23 discussed the motion of a symmetric top rotating about a fixed point on the symmetry axis when subject to a torque. The rolling wheel also involves rotation of a symmetric body that is subject to torques. However, the point of contact of the wheel with a static plane is on the periphery of the wheel, and friction at the point of contact is assumed to ensure zero slip. Note that friction is necessary to ensure that the rotating object rolls without slipping, but the frictional force does no work for pure rolling.

The coordinate system employed is shown in Figure 11.10. For simplicity it is better to use a moving coordinate frame $(1, 2, 3)$ that is fixed to the orientation of the wheel with the origin at the center of mass of the wheel, but this moving reference frame does not include the angular velocity $\dot{\psi}$ of the disk about the 3 axis. That is, the moving $(1, 2, 3)$ frame has angular velocities

$$\begin{aligned} \omega_1 &= \dot{\theta} \\ \omega_2 &= \dot{\phi}\sin\theta \\ \omega_3 &= \dot{\phi}\cos\theta \end{aligned} \tag{11.191}$$

The frame fixed in the rotating wheel must include the additional angular velocity of the disk $\dot{\psi}$ about the $\hat{e}_3$ axis, that is

$$\begin{aligned} \Omega_1 &= \omega_1 = \dot{\theta} \\ \Omega_2 &= \omega_2 = \dot{\phi}\sin\theta \\ \Omega_3 &= \omega_3 + \dot{\psi} = \dot{\phi}\cos\theta + \dot{\psi} \end{aligned} \tag{11.192}$$

where $\boldsymbol{\Omega}$ designates the angular velocity of the rotating disk, while $\boldsymbol{\omega}$ designates the rotation of the moving frame $(1, 2, 3)$.

For a thin disk the moments of inertia are related by the perpendicular axis theorem (chapter 11.9)

$$I_1 + I_2 = I_3$$

Since $I_1 = I_2$ for a uniform disk, $I_3 = 2I_1$.

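Equation 10.16 can be used to relate the vector forces $\mathbf{F}$ in the space-fixed frame to the rate of change of momenta in the moving frame $(1, 2, 3)$.

$$\mathbf{F} = \dot{\mathbf{p}}_{space} = \dot{\mathbf{p}}_{moving} + \boldsymbol{\omega}\times\mathbf{p} \tag{11.193}$$

This leads to the following relations for the three components in the moving frame

$$\begin{aligned} F_1 &= \dot{p}_1 + \omega_2p_3 - \omega_3p_2 \\ F_2 - Mg\sin\theta &= \dot{p}_2 + \omega_3p_1 - \omega_1p_3 \\ F_3 - Mg\cos\theta &= \dot{p}_3 + \omega_1p_2 - \omega_2p_1 \end{aligned} \tag{11.194}$$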
F3  - M g cos  9 = P3+  UJ1P2  - W2P1 
where $F_1, F_2, F_3$ are the reactive forces shown in figure 11.10.

Figure 11.10: Uniform disk rolling on a horizontal plane. The space-fixed axis system is $(x, y, z)$, while the moving reference frame $(1, 2, 3)$ is centered at the center of mass of the disk with the 1, 2 axes in the plane of the disk. The disk is rotating with a uniform angular velocity $\dot{\psi}$ about the 3 axis and rolling in the direction that is at an angle $\phi$ relative to the $x$ axis.

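Similarly, the torques $\mathbf{N}$ in the space-fixed frame can be related to the rate of change of angular momentum by

$$\mathbf{N} = \dot{\mathbf{L}}_{space} = \dot{\mathbf{L}}_{moving} + \boldsymbol{\omega}\times\mathbf{L} \tag{11.195}$$

where $L_i = I_i\Omega_i$. This leads to the following relations for the three torque equations in the moving frame

$$\begin{aligned} N_1 = -F_3R &= I_1\dot{\Omega}_1 + I_3\Omega_3\omega_2 - I_2\Omega_2\omega_3 \\ N_2 = 0 &= I_2\dot{\Omega}_2 + I_1\Omega_1\omega_3 - I_3\Omega_3\omega_1 \\ N_3 = F_1R &= I_3\dot{\Omega}_3 + I_2\Omega_2\omega_1 - I_1\Omega_1\omega_2 \end{aligned} \tag{11.196}$$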


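The rolling constraints are

$$\begin{aligned}
p_1 + MR\,\Omega_3 &= 0 \\
p_2 &= 0 \\
p_3 - MR\,\Omega_1 &= 0
\end{aligned} \tag{11.197}$$

where $p_i = Mv_i$. Combining equations 11.194, 11.196, 11.197 gives

$$\begin{aligned}
(I_1 + MR^2)\,\dot\Omega_1 + (I_3 + MR^2)\,\Omega_3\omega_2 - I_2\,\omega_3\Omega_2 &= -MgR\cos\theta \\
I_2\,\dot\Omega_2 + I_1\,\omega_3\Omega_1 - I_3\,\omega_1\Omega_3 &= 0 \\
(I_3 + MR^2)\,\dot\Omega_3 + I_2\,\omega_1\Omega_2 - (I_1 + MR^2)\,\omega_2\Omega_1 &= 0
\end{aligned} \tag{11.198}$$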


These  can  be  recognized  to  be  the  torque  equations  about  the  point  of  contact  O. 
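
The elimination of the reactive forces that leads from equations 11.194, 11.196 and 11.197 to equation 11.198 can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `combination_mismatch` and the random test values are illustrative assumptions. It builds $F_1$ and $F_3$ from the force equations and the rolling constraints, inserts them into the first and third torque equations, and compares the result with the combined form.

```python
import math
import random


def combination_mismatch(M, R, g, theta, I1, I3, Om, om, dOm):
    """Difference between the torque equations (11.196), with F1 and F3
    eliminated via (11.194) and (11.197), and the combined form (11.198).
    Om = (Omega1, Omega2, Omega3), om = (omega1, omega2, omega3),
    dOm = time derivatives of Om.  Returns mismatches for equations 1 and 3."""
    I2 = I1  # uniform disk
    Om1, Om2, Om3 = Om
    om1, om2, om3 = om
    dOm1, dOm2, dOm3 = dOm
    # Rolling constraints (11.197) and their time derivatives
    p1, p2, p3 = -M * R * Om3, 0.0, M * R * Om1
    dp1, dp3 = -M * R * dOm3, M * R * dOm1
    # Reactive forces from the force equations (11.194)
    F1 = dp1 + om2 * p3 - om3 * p2
    F3 = M * g * math.cos(theta) + dp3 + om1 * p2 - om2 * p1
    # Torque equations (11.196) moved to one side
    r1 = -F3 * R - (I1 * dOm1 + I3 * Om3 * om2 - I2 * Om2 * om3)
    r3 = F1 * R - (I3 * dOm3 + I2 * Om2 * om1 - I1 * Om1 * om2)
    # Combined equations (11.198) moved to one side with the same sign
    c1 = -((I1 + M * R**2) * dOm1 + (I3 + M * R**2) * Om3 * om2
           - I2 * om3 * Om2 + M * g * R * math.cos(theta))
    c3 = -((I3 + M * R**2) * dOm3 + I2 * om1 * Om2
           - (I1 + M * R**2) * om2 * Om1)
    return r1 - c1, r3 - c3


# Arbitrary (assumed) numbers: the mismatch should vanish to rounding error.
random.seed(1)
vals = [random.uniform(-2.0, 2.0) for _ in range(9)]
print(combination_mismatch(1.3, 0.4, 9.81, 1.1, 0.05, 0.10,
                           vals[0:3], vals[3:6], vals[6:9]))
```

Both returned numbers vanish to rounding error for any input, which is what is expected if the combined equations follow from the force, torque and constraint equations by pure algebra.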

Introduction of equations 11.191 and 11.192 into equation 11.198 expresses the equations of motion in terms of the Euler angles to be

$$\begin{aligned}
(I_1 + MR^2)\,\ddot\theta + (I_3 + MR^2)\,\dot\phi\sin\theta\,(\dot\phi\cos\theta + \dot\psi) - I_1\dot\phi^2\sin\theta\cos\theta &= -MgR\cos\theta \\
I_1\ddot\phi\sin\theta + 2I_1\dot\phi\,\dot\theta\cos\theta - I_3\dot\theta\,(\dot\phi\cos\theta + \dot\psi) &= 0 \\
(I_3 + MR^2)\,(\ddot\phi\cos\theta - \dot\phi\,\dot\theta\sin\theta + \ddot\psi) - MR^2\,\dot\theta\,\dot\phi\sin\theta &= 0
\end{aligned} \tag{11.199}$$

Equations 11.199 are non-linear, and a closed-form solution is possible only for limited cases such as when $\theta = 90°$.

Note that the above equations of motion also can be derived using Lagrangian mechanics knowing that

$$L = \frac{1}{2}M\left(\dot x^2 + \dot y^2 + \dot z^2\right) + \frac{1}{2}I_1\left(\Omega_1^2 + \Omega_2^2\right) + \frac{1}{2}I_3\Omega_3^2 - MgR\cos\theta$$

The differential equations of constraint can be derived from equations 11.197 to be

$$dx - R\cos\phi\, d\psi = 0$$
$$dy - R\sin\phi\, d\psi = 0$$

Generalized forces plus the Lagrange-Euler equations (6.47) can be used to derive the equations of motion and to solve for the components of the constraint force $F_1$, $F_2$, and $F_3$.


11.14  Example:  Tipping  stability  of  a rolling  wheel 


A circular  wheel  rolling  in  a vertical  plane  at  high  angular  velocity  initially  rolls  in  a straight  line  and 
remains  vertical.  However,  below  a certain  angular  velocity,  gyroscopic  forces  become  weaker  and  it  will 
tip  sideways  and  veer  rapidly  from  the  initial  direction.  It  is  interesting  to  estimate  the  minimum  angular 
velocity  of  the  disk  such  that  it  does  not  start  to  tip  over  sideways. 

Note that equations 11.199 are satisfied for $\theta = \frac{\pi}{2}$, $\dot\phi = 0$ and $\dot\psi = \Omega_3 =$ constant. Assume a small disturbance causes the tilt angle $\theta = \frac{\pi}{2} + \alpha$ where $\alpha$ is small and that $\dot\phi$ is non-zero but small, that is, $\dot\theta = \dot\alpha$ and $\dot\phi$ are small. Keeping only terms to first order in the third of equations 11.199, and integrating gives

$$\dot\phi\cos\theta + \dot\psi = \Omega_3 \tag{a}$$

Using $\sin\theta \approx 1$ and $\cos\theta \approx -\alpha$, the first two of equations 11.198 become

$$(I_1 + MR^2)\,\ddot\alpha + (I_3 + MR^2)\,\dot\phi\,\Omega_3 - MgR\,\alpha = 0 \tag{b}$$

$$I_1\ddot\phi - I_3\Omega_3\,\dot\alpha = 0 \tag{c}$$

Integrating equation (c) gives

$$\dot\phi = \frac{I_3\Omega_3}{I_1}\,\alpha \tag{d}$$

Inserting (d) into (b) gives

$$(I_1 + MR^2)\,\ddot\alpha + \left[(I_3 + MR^2)\,\frac{I_3\Omega_3^2}{I_1} - MgR\right]\alpha = 0 \tag{e}$$

Equation (e) has a stable oscillatory solution when the square bracket is positive, that is,

$$\Omega_3^2 > \frac{I_1 MgR}{I_3\,(I_3 + MR^2)} \tag{f}$$

which gives the minimum angular velocity required for stable rolling motion. For angular velocity less than the minimum, the square bracket in equation (e) is negative, leading to an exponentially growing, divergent solution. For a uniform disk the perpendicular axis theorem gives $I_3 = 2I_1 = \frac{1}{2}MR^2$ for which equation (f) gives

$$\Omega_3^2 > \frac{g}{3R} \tag{g}$$

Therefore the critical linear velocity of the wheel is

$$v = R\,\Omega_3 > \sqrt{\frac{gR}{3}} \tag{h}$$

The bicycle wheel provides a common example of the tipping of a rolling wheel. For the typical 0.35 m radius of a bicycle wheel, this gives a critical velocity of $v > 1.07$ m/s $= 2.4$ m.p.h.⁴
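
The stability criterion (f) is easy to evaluate numerically. The short sketch below is illustrative only: the function name `critical_speed` and its parameters are assumptions introduced here. The moments of inertia enter as the dimensionless ratios $I_1/MR^2$ and $I_3/MR^2$, so the mass cancels; the default values correspond to the uniform disk treated in the example.

```python
import math


def critical_speed(R, g=9.81, I1_over_MR2=0.25, I3_over_MR2=0.5):
    """Minimum rolling speed v = R * Omega_3 implied by criterion (f).

    With I1 = a*M*R^2 and I3 = b*M*R^2, criterion (f) reads
    Omega_3^2 > a*g / (b*(b + 1)*R), so the mass M drops out.
    """
    omega_sq = I1_over_MR2 * g / (I3_over_MR2 * (I3_over_MR2 + 1.0) * R)
    return R * math.sqrt(omega_sq)


# Uniform disk with the 0.35 m radius quoted for a bicycle wheel
v_disk = critical_speed(0.35)
print(round(v_disk, 2), "m/s")             # about 1.07 m/s
print(round(v_disk / 0.44704, 1), "mph")   # about 2.4 m.p.h.

# Thin hoop (assumed ratios I1 = M R^2 / 2, I3 = M R^2), for comparison
print(round(critical_speed(0.35, I1_over_MR2=0.5, I3_over_MR2=1.0), 2), "m/s")
```

A real bicycle wheel has most of its mass near the rim, so the hoop ratios are arguably the closer idealization; the criterion then gives a somewhat lower critical speed than the uniform disk.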

4 The stability of the bicycle is sensitive to the castor and other aspects of the steering geometry of the front wheel, in addition to gyroscopic effects. Excellent articles on this subject have been written by D.E.H. Jones, Physics Today 23(4) (1970) 34, and also by J. Lowell & H.D. McKell, American Journal of Physics 50 (1982) 1106.


11.15 Example: Pivoting

The  difference  between  a rolling  and  a pivoting  body  can  lead  to  confusion  as  to  whether  to  compute  the 
angular  momentum  and  kinetic  energy  with  respect  to  the  center  of  mass,  or  the  point  of  contact  on  the 
circumference  of  the  body  for  rolling,  or  of  the  pivot  point  for  a fixed  pivot.  It  is  useful  to  compare  the 
angular  momentum  and  total  energy  computed  with  respect  to  (1)  the  center  of  mass  of  a cylinder  and  (2) 
with  respect  to  the  point  of  contact  of  the  cylinder  and  the  plane  for  pivoting  or  rolling. 

Consider a cylinder of radius $R$ and mass $m$ pivoting about the point of contact with the plane with angular velocity $\omega = \frac{v}{R}$ where $v$ is the instantaneous velocity of the center of mass. The angular momentum about the pivot point is

$$L_{pivot} = I_{pivot}\,\omega$$

The parallel-axis theorem relates the moment of inertia with respect to the pivot point and center of mass

$$I_{pivot} = mR^2 + I_{cm}$$

The angular velocities of the center of mass, and about the center of mass, are identical since the pivot point is fixed, that is

$$\omega_{pivot} = \omega_{cm} = \omega$$

Thus the angular momentum about the pivot point is given by the sum of the angular momenta

$$L_{pivot} = I_{pivot}\,\omega = mR^2\omega + I_{cm}\,\omega$$

That is, the angular momentum is the sum of the angular momentum of the body about the center of mass plus the angular momentum of the center of mass about the pivot point. This is an example of Chasles' theorem.

The kinetic energy is given only by the rotational energy since the pivot point is stationary

$$KE_{pivot} = \frac{1}{2}I_{pivot}\,\omega^2 = \frac{1}{2}mR^2\omega^2 + \frac{1}{2}I_{cm}\,\omega^2 = \frac{1}{2}mv^2 + \frac{1}{2}I_{cm}\,\omega^2$$

That is, it equals the kinetic energy of rotation about the center of mass plus the instantaneous kinetic energy for translation of the center of mass, in agreement with Chasles' theorem. Thus for pivoting the angular momentum and kinetic energy are the same if evaluated using either center of mass coordinates or using the pivot point as the reference point.
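
The two ways of evaluating the pivoting cylinder can be compared with a few lines of arithmetic. The following sketch is illustrative: the function name `pivot_quantities`, the inertia ratio `k` and the numerical values are assumptions introduced here, with `k = 0.5` corresponding to a uniform solid cylinder.

```python
def pivot_quantities(m, R, v, k=0.5):
    """Angular momentum and kinetic energy of a cylinder pivoting about a
    point on its circumference, evaluated two ways.

    k = I_cm / (m R^2); k = 0.5 is a uniform solid cylinder (an assumption).
    Returns (L via I_pivot, L as sum, KE via I_pivot, KE as sum).
    """
    omega = v / R                      # angular velocity about the pivot
    I_cm = k * m * R**2
    I_pivot = I_cm + m * R**2          # parallel-axis theorem
    L_pivot = I_pivot * omega
    L_sum = m * R * v + I_cm * omega   # centre-of-mass part plus spin part
    KE_pivot = 0.5 * I_pivot * omega**2
    KE_sum = 0.5 * m * v**2 + 0.5 * I_cm * omega**2
    return L_pivot, L_sum, KE_pivot, KE_sum


# Assumed sample values: m = 2 kg, R = 0.1 m, v = 3 m/s
print(pivot_quantities(2.0, 0.1, 3.0))
```

The first and second entries agree, as do the third and fourth, which is the statement that either reference point gives the same angular momentum and kinetic energy for a pivoting body.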


11.16  Example:  Rolling 

Consider the same system except the cylinder is rolling without slipping on a plane. The subtle difference between pivoting and rolling is that the rolling point of contact and the center of mass are moving at the same velocity, in contrast to pivoting where the point of contact is stationary. Thus for rolling there is no angular momentum of the center of mass with respect to the point of contact. Therefore the angular momentum about the instantaneous point of contact is

$$L_{rolling} = L_{pivot} + L_{cm} = 0 + I_{cm}\,\omega = I_{cm}\,\omega$$

That is, the angular momentum only includes the angular momentum about the center of mass, which is smaller than the angular momentum for the same body pivoting about a point on the periphery of the cylinder. The kinetic energy is given by

$$KE_{roll} = \frac{1}{2}mv^2 + \frac{1}{2}I_{cm}\,\omega^2$$

Thus the angular momentum is significantly smaller for rolling relative to pivoting of a given body, whereas the kinetic energy is the same for both rolling or pivoting of a given body.


11.25 Dynamic balancing of wheels

It  is  crucial  for  rotating  machinery  that  rotors  be  both  statically  and  dynamically  balanced.  Static  balance 
means  that  the  center  of  mass  is  on  the  axis  of  rotation.  Dynamic  balance  means  that  the  axis  of  rotation  is 
a principal  axis. 

For example, consider the symmetric rotor that has its symmetry axis at an angle $\phi$ to the axis of rotation. In this case the system is statically balanced since the center of gravity is on the axis of rotation. However, the rotation axis is at an angle $\phi$ to the symmetry axis. This implies that the axle has to provide a torque to maintain rotation that is not along a principal axis. If you distort the front wheel of your car by hitting it sideways against the sidewalk curb, or if the wheel is not dynamically balanced, then you will find that the steering wheel can vibrate wildly at certain speeds due to the torques caused by dynamic imbalance shaking the steering mechanism. This can be especially bad when the frequency is close to a resonant frequency of the suspension system. Insist that your automobile wheels are dynamically balanced when you change tires, since static balancing will not eliminate the dynamic imbalance forces. Another example is that the ailerons, rudder, and elevator on aircraft usually are dynamically balanced to stop the build up of oscillations that can couple to flexing and flutter of the airframe, which can lead to airframe failure.


11.17  Example:  Forces  on  the  bearings  of  a rotating  circular  disk 


A homogeneous circular disk of mass $M$ and radius $R$ rotates with constant angular velocity $\omega$ about a body-fixed axis passing through the center of the circular disk as shown in the adjacent figure. The rotation axis is inclined at an angle $\alpha$ to the symmetry axis of the circular disk by bearings on both sides of the disk spaced a distance $d$ apart. Determine the forces on the bearings.

Figure: Rotation of circular disk about an axis that is at an angle $\alpha$ to the symmetry axis of the circular disk.

Choose the body-fixed axes such that $\hat{e}_3$ is along the symmetry axis of the circular disk, and $\hat{e}_1$ points in the plane of the disk symmetry axis and the rotation axis. These axes are the principal axes for which the inertia tensor can be calculated to be

$$\mathbf{I} = \frac{MR^2}{4}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$$

Note that for this thin plane lamina disk $I_{11} + I_{22} = I_{33}$. The components of the angular velocity vector $\boldsymbol{\omega}$ along the three body-fixed axes are given by

$$\boldsymbol{\omega} = (\omega\sin\alpha,\ 0,\ \omega\cos\alpha)$$

Since it is assumed that $\dot\omega = 0$ then substituting into Euler's equations (11.103) gives the torques acting to be

$$N_1 = N_3 = 0$$

$$N_2 = -\omega^2\sin\alpha\cos\alpha\,\frac{MR^2}{4}$$

That is, the torque is in the $\hat{e}_2$ direction. Thus the forces $\mathbf{F}$ on the bearings can be calculated since $\mathbf{N} = \mathbf{r}\times\mathbf{F}$, thus

$$|\mathbf{F}| = \frac{|\mathbf{N}|}{2d} = \frac{MR^2\omega^2\sin 2\alpha}{16d}$$


Estimate  the  size  of  these  forces  for  the  front  wheel  of  your  car  travelling  at  70  m.p.h.  if  the  rotation  axis  is 
displaced  by  2°  from  the  symmetry  axis  of  the  wheel. 
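
One hedged way to carry out such an estimate is sketched below. The torque follows from Euler's equations as in the example, and the force uses the relation $|F| = |N|/2d$ as read from the example. The function names `bearing_torque` and `bearing_force`, and all numerical values (wheel mass, radius and bearing spacing), are assumptions chosen only for illustration; the wheel is idealized as a uniform thin disk.

```python
import math


def bearing_torque(M, R, omega, alpha):
    """Magnitude of N2 from Euler's equations for a thin uniform disk whose
    rotation axis is tilted by alpha (radians) from the symmetry axis."""
    return 0.25 * M * R**2 * omega**2 * math.sin(alpha) * math.cos(alpha)


def bearing_force(M, R, omega, alpha, d):
    """Force on each bearing, using |F| = |N| / (2 d) as read from the example."""
    return bearing_torque(M, R, omega, alpha) / (2.0 * d)


# Assumed illustrative values, not taken from the text
M = 20.0                   # wheel mass in kg
R = 0.30                   # wheel radius in m
d = 0.10                   # bearing spacing parameter in m
v = 70 * 0.44704           # 70 m.p.h. converted to m/s
omega = v / R              # rolling without slipping
alpha = math.radians(2.0)  # 2 degree misalignment

print("torque [N m]:", bearing_torque(M, R, omega, alpha))
print("force  [N]  :", bearing_force(M, R, omega, alpha, d))
```

Because the torque scales as $\omega^2$, doubling the speed quadruples the bearing load, which is why a small imbalance that is unnoticeable in town can shake the steering at highway speeds.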
Figure 11.11: Forward two-and-a-half somersaults with two twists demonstrates unequivocally that a diver can initiate continuous twisting in midair. In the illustrated maneuver the diver does more than one full somersault before he starts to twist. To maintain the twisting the diver does not have to move his legs. [Fro80]


11.26  Rotation  of  deformable  bodies 

The discussion in this chapter has assumed that the rotating body is a rigid body. However, there is a broad and important class of problems in classical mechanics where the rotating body is deformable, which leads to intriguing new phenomena. The classic example is the cat which, if dropped upside down with zero angular momentum, is able to distort its body plus tail in order to rotate such that it lands on its feet, in spite of the fact that there are no external torques acting and thus the angular momentum is conserved.

Another example is the high diver doing a forward two-and-a-half somersault with two twists. [Fro80] Once the diver leaves the board the total angular momentum must be conserved since there are no external torques acting on the system. The diver begins a somersault by rotating about a horizontal axis, which is a principal axis that is perpendicular to the axis of his body and passes through his hips. Initially the angular momentum and angular velocity are parallel and point perpendicular to the symmetry axis. The diver first goes into a tuck, which greatly reduces his moment of inertia about the axis of his somersault and concomitantly increases his angular velocity about this axis, and he performs one full somersault prior to initiating twisting. Then the diver twists his body and moves his arms to destroy the axial symmetry of his body, which changes the direction of the principal axes of the inertia tensor. This causes the angular velocity to change in both direction and magnitude such that the angular momentum remains conserved. The angular velocity now is no longer parallel to the angular momentum, resulting in a component along the length of the body that causes it to twist while somersaulting. This twisting motion will continue until the symmetry of the diver's body is restored, which is done just before entering the water. By skilled timing and body movement, the diver restores the symmetry of his body to the optimum orientation for entering the water.

Such phenomena involving deformable bodies are important to the motion of ballet dancers, jugglers, astronauts in space, and satellites. The above rotational phenomena would be impossible if the cat or diver were rigid bodies having a fixed inertia tensor. Calculation of the dynamics of the motion of deformable bodies is complicated and beyond the scope of this book, but the concept of a time-dependent transformation of the inertia tensor underlies the subsequent motion. The theory is complicated since it is difficult even to quantify what corresponds to rotation as the body morphs from one shape to another. Further information on this topic can be found in the literature. [Fro80]


11.27 Summary


This chapter has introduced the important topic of rigid-body rotation, which has many applications in physics, engineering, sports, etc.


Inertia tensor  The concept of the inertia tensor was introduced where the 9 components of the inertia tensor are given by

$$I_{ij} = \int \rho(\mathbf{r}')\left(\delta_{ij}\sum_k x_k'^{\,2} - x_i' x_j'\right) dV \tag{11.14}$$


Steiner's parallel-axis theorem

$$J_{11} = I_{11} + M\left(\left(a_1^2 + a_2^2 + a_3^2\right)\delta_{11} - a_1^2\right) = I_{11} + M\left(a_2^2 + a_3^2\right) \tag{11.43}$$

relates the inertia tensor about the center-of-mass to that about a parallel axis system not through the center of mass.

Diagonalization of the inertia tensor about any point was used to find the corresponding principal axes of the rigid body.


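Angular momentum  The angular momentum $\mathbf{L}$ for rigid-body rotation is expressed in terms of the inertia tensor and angular frequency $\boldsymbol{\omega}$ by

$$\mathbf{L} = \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix}\cdot\begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} = \{\mathbf{I}\}\cdot\boldsymbol{\omega} \tag{11.56}$$


Rotational kinetic energy  The rotational kinetic energy is

$$T_{rot} = \frac{1}{2}\begin{pmatrix} \omega_1 & \omega_2 & \omega_3 \end{pmatrix}\cdot\begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix}\cdot\begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} \tag{11.72}$$

$$T_{rot} = \frac{1}{2}\,\boldsymbol{\omega}\cdot\mathbf{L} \tag{11.73}$$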


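Euler angles  The Euler angles relate the space-fixed and body-fixed principal axes. The angular velocity $\boldsymbol{\omega}$ expressed in terms of the Euler angles has components for the angular velocity in the body-fixed axis system (1, 2, 3)

$$\omega_1 = \dot\phi_1 + \dot\theta_1 + \dot\psi_1 = \dot\phi\sin\theta\sin\psi + \dot\theta\cos\psi \tag{11.86}$$

$$\omega_2 = \dot\phi_2 + \dot\theta_2 + \dot\psi_2 = \dot\phi\sin\theta\cos\psi - \dot\theta\sin\psi \tag{11.87}$$

$$\omega_3 = \dot\phi_3 + \dot\theta_3 + \dot\psi_3 = \dot\phi\cos\theta + \dot\psi \tag{11.88}$$

Similarly, the components of the angular velocity for the space-fixed axis system $(x, y, z)$ are

$$\omega_x = \dot\theta\cos\phi + \dot\psi\sin\theta\sin\phi \tag{11.89}$$

$$\omega_y = \dot\theta\sin\phi - \dot\psi\sin\theta\cos\phi \tag{11.90}$$

$$\omega_z = \dot\phi + \dot\psi\cos\theta \tag{11.91}$$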


Rotational invariants  The powerful concept of the rotational invariance of scalar properties was introduced. Important examples of rotational invariants are the Hamiltonian, Lagrangian, and Routhian.


Euler equations of motion for rigid-body motion  The dynamics of rigid-body rotational motion was explored and the Euler equations of motion were derived using both Newtonian and Lagrangian mechanics.

$$\begin{aligned}
N_1^{ext} &= I_1\dot\omega_1 - (I_2 - I_3)\,\omega_2\omega_3 \\
N_2^{ext} &= I_2\dot\omega_2 - (I_3 - I_1)\,\omega_3\omega_1 \\
N_3^{ext} &= I_3\dot\omega_3 - (I_1 - I_2)\,\omega_1\omega_2
\end{aligned} \tag{11.103}$$

Lagrange equations of motion for rigid-body motion  The Euler equations of motion for rigid-body motion, given in equation 11.103, were derived using the Lagrange-Euler equations.

Torque-free  motion  of  rigid  bodies  The  Euler  equations  and  Lagrangian  mechanics  were  used  to  study 
torque-free  rotation  of  both  symmetric  and  asymmetric  bodies  including  discussion  of  the  stability  of  torque- 
free  rotation. 

Rotating  symmetric  body  subject  to  a torque  The  complicated  motion  exhibited  by  a symmetric  top, 
that  is  spinning  about  one  fixed  point  and  subject  to  a torque,  was  introduced  and  solved  using  Lagrangian 
mechanics. 

The rolling wheel  The non-holonomic motion of rolling wheels was introduced, as well as the importance of static and dynamic balancing of rotating machinery.

Rotation of deformable bodies  The complicated non-holonomic motion involving rotation of deformable bodies was introduced.


Workshop exercises

1.  Three  objects  are  described  below.  Break  up  into  three  groups,  one  group  per  object,  and  determine  the  inertia 
tensor. 

• A very thin sheet with a mass density $\sigma = Cxy$ where $C$ is a positive constant. The sheet lies in the $xy$ plane and its sides are both of length $a$.

• An inclined-plane shaped block of mass $M$ is oriented with one corner at the origin as shown.

• An equilateral triangle made up of three thin rods of length $l$ and uniform mass density $\rho$.

2.  Consider  the  objects  described  in  problem  1. 

(a)  For  the  first  object  (the  thin  sheet),  determine  the  principal  moments  of  inertia. 

(b)  For  the  second  object  (the  inclined  plane),  determine  the  principal  axes. 

(c)  For  the  third  object  (the  equilateral  triangle),  determine  the  products  of  inertia. 

3.  Consider  the  inertia  tensor. 

(a)  What  are  the  advantages  of  diagonalizing  the  inertia  tensor? 

(b)  How  can  the  inertia  tensor  be  diagonalized? 

(c)  What  can  you  say  about  a tensor  that  is  real  and  symmetric? 

4.  A hollow spherical shell has a mass $m$ and radius $R$.

(a)  Calculate  the  inertia  tensor  for  a set  of  coordinates  whose  origin  is  at  the  center  of  mass  of  the  shell. 

(b)  Now  suppose  that  the  shell  is  rolling  without  slipping  toward  a step  of  height  h,  where  h < R.  The  shell 
has  a linear  velocity  V.  What  is  the  angular  momentum  of  the  shell  relative  to  the  tip  of  the  step? 

(c)  The  shell  now  strikes  the  tip  of  the  step  inelastically  (so  that  the  point  of  contact  sticks  to  the  step, 
but  the  shell  can  still  rotate  about  the  tip  of  the  step).  What  is  the  angular  momentum  of  the  shell 
immediately  after  contact? 

(d)  Finally, find the minimum velocity which enables the shell to surmount the step. Express your result in terms of $m$, $g$, $R$, and $h$.

5.  The vectors $\hat{x}$, $\hat{y}$, and $\hat{z}$ constitute a set of orthogonal right-handed axes. The vectors $\hat{x} + \hat{y} - 2\hat{z}$, $-\hat{x} + \hat{y}$, and $\hat{x} + \hat{y} + \hat{z}$ are also perpendicular to one another.

(a)  Write out the set of direction cosines relating the new axes to the old.

(b)  How are the Eulerian angles defined? Describe this transformation by a set of Eulerian angles.

6.  A torsional pendulum consists of a vertical wire attached to a mass which can rotate about the vertical axis. Consider three torsional pendula which consist of identical wires from which identical homogeneous solid cubes are hung. One cube is hung from a corner, one from midway along an edge, and one from the middle of a face as shown. What are the ratios of the periods of the three pendula?


7.  A dumbbell comprises two equal point masses $M$ connected by a massless rigid rod of length $2A$ which is constrained to rotate about an axle fixed to the center of the rod at an angle $\theta$ as shown in the figure. The center of the rod is at the origin of the coordinates, the axle along the $z$-axis, and the dumbbell lies in the $x$-$y$ plane at $t = 0$. The angular velocity $\omega$ is a constant in time and is directed along the $z$ axis.

a)  Calculate all elements of the inertia tensor. Be sure to specify the coordinate system used.

b)  Using the calculated inertia tensor find the angular momentum of the dumbbell in the laboratory frame as a function of time.

c)  Using the equation $\mathbf{L} = \mathbf{r}\times\mathbf{p}$, calculate the angular momentum and show that it is equal to the answer of part (b).

d)  Calculate the torque on the axle as a function of time.

e)  Calculate the kinetic energy of the dumbbell.


8.  A heavy symmetric top has a mass $m$ with the center of mass a distance $h$ from the fixed point about which it spins and $I_1 = I_2 \neq I_3$. The top is precessing at a steady angular velocity about the vertical space-fixed $z$ axis. What is the minimum spin $\omega$ about the body-fixed symmetry axis, that is, the 3 axis, assuming that the 3 axis is inclined at an angle $\theta$ with respect to the vertical $z$ axis? Solve the problem at the instant when the $z$, $x$, 3, 1 axes all are in the same plane as shown in the figure.

9.  Consider an object with the center of mass at the origin and inertia tensor,

$$\mathbf{I} = \begin{pmatrix} 1/2 & -1/2 & 0 \\ -1/2 & 1/2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

(a)  Determine  the  principal  moments  of  inertia  and  the  principal  axes.  Guess  the  object. 

(b)  Determine  the  rotation  matrix  R and  compute  R^  I R.  Do  the  diagonal  elements  match  with  your  results 
from  (a)?  Note:  columns  of  R are  eigenvectors  of  I. 

(c)  Assume  ui  = + z).  Determine  L in  the  rotating  coordinate  system.  Are  L and  u>  in  the  same 

direction?  What  does  this  mean? 

(d)  Repeat (c) for $\boldsymbol{\omega} = \frac{1}{\sqrt{2}}(\hat{x} - \hat{y})$. What is different and why?

(e)  For  which  case  will  there  be  a non-zero  torque  required? 

(f)  Determine the rotational kinetic energy for the case $\boldsymbol{\omega} = \frac{1}{\sqrt{2}}(\hat{x} - \hat{y})$.


10.  Consider a wheel (solid disk) of mass $m$ and radius $r$. The wheel is subject to angular velocities $\boldsymbol{\omega}_a = \omega_a\hat{n}$ where $\hat{n}$ is normal to the surface and $\boldsymbol{\omega}_b = \omega_b\hat{z}$.


(a)  Choose a set of principal axes by observation.

(b)  Determine the angular velocities and angular momentum along the principal axes. Note: $I_1 = \frac{1}{2}mr^2$ and $I_2 = I_3 = \frac{1}{4}mr^2$.

(c)  Determine  the  torque. 

(d)  Determine  the  rotation  matrix  that  rotates  the  fixed  coordinate  system  to  the  body  coordinate  system. 


11.  Determine  the  principal  moments  of  inertia  of  an  ellipsoid  given  by  the  equation, 


12.  Determine the principal moments of inertia of a sphere of radius $R$ with a cavity of radius $r$ located $e$ from the center of the sphere.


13.  Three equal masses $m$ form the vertices of an equilateral triangle of side length $L$. The masses are located at $\left(0, 0, \frac{L}{\sqrt{3}}\right)$, $\left(0, \frac{L}{2}, -\frac{L}{2\sqrt{3}}\right)$ and $\left(0, -\frac{L}{2}, -\frac{L}{2\sqrt{3}}\right)$ such that the center-of-mass is located at the origin.


(a)  Determine the principal moments of inertia and principal axes.

Now consider the same system rotated 45° about the $z$-axis. The masses are located at $\left(0, 0, \frac{L}{\sqrt{3}}\right)$, $\left(-\frac{L}{2\sqrt{2}}, \frac{L}{2\sqrt{2}}, -\frac{L}{2\sqrt{3}}\right)$ and $\left(\frac{L}{2\sqrt{2}}, -\frac{L}{2\sqrt{2}}, -\frac{L}{2\sqrt{3}}\right)$, respectively.

(b)  Determine the principal moments of inertia and principal axes.

(c)  Could you have answered (b) without explicitly determining the inertia tensor? How?


Problems

1.  Calculate the moments of inertia for a homogeneous cone of mass $M$ whose height is $h$ and whose base has a radius $R$. Choose the $x_3$-axis along the symmetry axis of the cone.

a)  Choose  the  origin  at  the  apex  of  the  cone,  and  calculate  the  elements  of  the  inertia  tensor. 

b)  Make  a transformation  such  that  the  center  of  mass  of  the  cone  is  the  origin  and  find  the  principal  moments 
of  inertia. 

2.  Four masses, all of mass $m$, lie in the $x$-$y$ plane at positions $(x, y) = (a, 0), (-a, 0), (0, +2a), (0, -2a)$. These are joined by massless rods to form a rigid body.

(a)  Find the inertia tensor, using the $x$, $y$, $z$ axes as a reference system. Exhibit the tensor as a matrix.

(b)  Consider a direction given by the unit vector $\hat{n}$ that lies equally between the positive $x$, $y$, $z$ axes; that is, it makes equal angles with these three directions. Find the moment of inertia for rotation about this $\hat{n}$ axis.

(c)  Given that at a certain time $t$ the angular velocity vector lies along the above direction $\hat{n}$, find, for that instant, the angle between the angular momentum vector and $\hat{n}$.

3.  A homogeneous cube, each edge of which has a length $l$, initially is in a position of unstable equilibrium with one edge of the cube in contact with a horizontal plane. The cube then is given a small displacement causing it to tip over and fall. Show that the angular velocity of the cube when one face strikes the plane is given by

$$\omega^2 = A\,\frac{g}{l}\left(\sqrt{2} - 1\right)$$

where $A = \frac{3}{2}$ if the edge cannot slide on the plane, and where $A = \frac{12}{5}$ if sliding can occur without friction.

4.  A symmetric body moves without the influence of forces or torques. Let $x_3$ be the symmetry axis of the body and $\mathbf{L}$ be along $x_3'$. The angle between $\boldsymbol{\omega}$ and $x_3$ is $\alpha$. Let $\boldsymbol{\omega}$ and $\mathbf{L}$ initially be in the $x_2$-$x_3$ plane. What is the angular velocity of the symmetry axis about $\mathbf{L}$ in terms of $I_1$, $I_3$, $\omega$, and $\alpha$?

5.  Consider a thin rectangular plate with dimensions $a$ by $b$ and mass $M$. Determine the torque necessary to rotate the thin plate with angular velocity $\omega$ about a diagonal. Explain the physical behavior for the case when $a = b$.


Coupled linear oscillators


12.1  Introduction 

Chapter 3 discussed the behavior of a single linearly-damped linear oscillator subject to a harmonic force. No account was taken of the influence of the single oscillator on the driver for the case of forced oscillations. Many systems in nature comprise complicated free or forced oscillations of coupled-oscillator systems. Examples of coupled oscillators are: automobile suspension systems, electronic circuits, electromagnetic fields, musical instruments, atoms bound in a crystal, neural circuits in the brain, networks of pacemaker cells in the heart, etc. Energy can be transferred back and forth between coupled oscillators as the motion evolves. However, it is possible to describe the motion of coupled linear oscillators in terms of a sum over independent normal coordinates, i.e. normal modes, even though the motion may be very complicated. These normal modes are constructed from the original coordinates in such a way that the normal modes are uncoupled. The topic of finding the normal modes of coupled oscillator systems is a ubiquitous problem encountered in all branches of science and engineering. As discussed in chapter 4, oscillatory motion of non-linear systems can be complicated. Fortunately most oscillatory systems are approximately linear when the amplitude of oscillation is small. This discussion assumes that the oscillation amplitudes are sufficiently small to ensure linearity.

12.2  Two  coupled  linear  oscillators 

Consider the two-coupled linear oscillator, shown in figure 12.1, which comprises two identical masses each connected to fixed locations by identical springs having a force constant $\kappa$. A spring with force constant $\kappa'$ couples the two oscillators. The equilibrium lengths of the outer two springs are $l$ while that of the coupling spring is $l'$. The problem is simplified by restricting the motion to be along the line connecting the masses and assuming fixed endpoints. The small displacements of $m_1$ and $m_2$ are taken to be $x_1$ and $x_2$ with respect to the equilibrium positions $l$ and $l + l'$ respectively. The restoring force on $m_1$ is $-\kappa x_1 - \kappa'(x_1 - x_2)$ while the restoring force on $m_2$ is $-\kappa x_2 - \kappa'(x_2 - x_1)$. This coupled double-oscillator system exhibits basic features of coupled linear oscillator systems.

Figure 12.1: Two coupled linear oscillators. The equilibrium spring-lengths are $l$ for the outer springs and $l'$ for the coupling spring. The displacements from the stable locations are given by $x_1$ and $x_2$. The separation between the two masses is $r$ and the location of the center-of-mass is $R_{cm}$.

Assuming $m_1 = m_2 = m$, then the equations of motion are

$$\begin{aligned}
m\ddot x_1 + (\kappa + \kappa')\,x_1 - \kappa' x_2 &= 0 \\
m\ddot x_2 + (\kappa + \kappa')\,x_2 - \kappa' x_1 &= 0
\end{aligned} \tag{12.1}$$

Assume that the motion for these coupled equations is oscillatory with a solution of the form

$$x_1 = B_1 e^{i\omega t} \tag{12.2}$$

$$x_2 = B_2 e^{i\omega t} \tag{12.3}$$

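where the constants $B$ may be complex to take into account both the magnitude and phase. Substituting these possible solutions into the equations of motion gives

$$\begin{aligned}
-m\omega^2 B_1 e^{i\omega t} + (\kappa + \kappa')\,B_1 e^{i\omega t} - \kappa' B_2 e^{i\omega t} &= 0 \\
-m\omega^2 B_2 e^{i\omega t} + (\kappa + \kappa')\,B_2 e^{i\omega t} - \kappa' B_1 e^{i\omega t} &= 0
\end{aligned}$$

Collecting terms, and cancelling the common exponential factor, gives

$$\begin{aligned}
(\kappa + \kappa' - m\omega^2)\,B_1 - \kappa' B_2 &= 0 \\
(\kappa + \kappa' - m\omega^2)\,B_2 - \kappa' B_1 &= 0
\end{aligned} \tag{12.4}$$

The existence of a non-trivial solution of these two simultaneous equations requires that the determinant of the coefficients of $B_1$ and $B_2$ must vanish, that is

$$\begin{vmatrix} \kappa + \kappa' - m\omega^2 & -\kappa' \\ -\kappa' & \kappa + \kappa' - m\omega^2 \end{vmatrix} = 0 \tag{12.5}$$

The expansion of this secular determinant yields

$$(\kappa + \kappa' - m\omega^2)^2 - \kappa'^2 = 0 \tag{12.6}$$

Solving for $\omega$ gives

$$\omega = \sqrt{\frac{\kappa + \kappa' \pm \kappa'}{m}} \tag{12.7}$$

That is, there are two characteristic frequencies (or eigenfrequencies) for the system

$$\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}} \tag{12.8}$$

$$\omega_2 = \sqrt{\frac{\kappa}{m}} \tag{12.9}$$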

Figure  12.2:  Displacement  of  each  of  two 
Since  superposition  applies  for  these  linear  equations,  then  the  coupled  linear  harmonic  oscillators  with 
general  solution  can  be  written  as  a sum  of  the  terms  that  account  re  = 4 and  re'  = 1 in  relative  units, 
for  the  two  possible  values  of  w. 
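The two roots of the secular equation can be checked numerically. The following is a minimal sketch, assuming equal masses and the illustrative values $\kappa = 4$, $\kappa' = 1$, $m = 1$; the function names are hypothetical and chosen only for this illustration.

```python
import math

def eigenfrequencies(kappa, kappa_prime, m):
    """Roots of (kappa + kappa' - m*omega^2)^2 - kappa'^2 = 0, as (omega_1, omega_2)."""
    omega_1 = math.sqrt((kappa + 2.0 * kappa_prime) / m)  # '+' root, higher frequency
    omega_2 = math.sqrt(kappa / m)                         # '-' root, lower frequency
    return omega_1, omega_2

def secular_determinant(omega, kappa, kappa_prime, m):
    """Determinant of the 2x2 coefficient matrix multiplying (B1, B2)."""
    diagonal = kappa + kappa_prime - m * omega ** 2
    return diagonal * diagonal - kappa_prime ** 2

w1, w2 = eigenfrequencies(4.0, 1.0, 1.0)
print(w1, w2)
# Both values should make the secular determinant vanish (up to rounding).
print(secular_determinant(w1, 4.0, 1.0, 1.0), secular_determinant(w2, 4.0, 1.0, 1.0))
```

Evaluating the determinant at each root is a simple consistency check on the algebra, independent of how the roots were obtained.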

Figure 12.2 shows the solutions for a case where $\kappa = 4$ and $\kappa' = 1$, in arbitrary units, with the initial condition that $x_2 = D$ and $x_1 = \dot{x}_1 = \dot{x}_2 = 0$. The two characteristic frequencies are $\omega_1 = \sqrt{6/m}$ and $\omega_2 = \sqrt{4/m}$. The characteristic beats phenomenon is exhibited where the envelope over one complete cycle of the low frequency encompasses several higher frequency oscillations. That is, the solution is

$$x_2(t) = \frac{D}{4}\left[e^{i\omega_1 t} + e^{-i\omega_1 t} + e^{i\omega_2 t} + e^{-i\omega_2 t}\right] = D\cos\left(\frac{\omega_1 + \omega_2}{2}t\right)\cos\left(\frac{\omega_1 - \omega_2}{2}t\right) \tag{12.10}$$

while

$$x_1(t) = \frac{D}{4}\left[-e^{i\omega_1 t} - e^{-i\omega_1 t} + e^{i\omega_2 t} + e^{-i\omega_2 t}\right] = D\sin\left(\frac{\omega_1 + \omega_2}{2}t\right)\sin\left(\frac{\omega_1 - \omega_2}{2}t\right) \tag{12.11}$$

The energy in the two coupled oscillators flows back and forth between the oscillators as illustrated in figure 12.2.
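The beat behaviour can be made concrete by evaluating the product form of the solution and comparing it with the equivalent sum over the two eigenfrequencies. This is a sketch under the stated initial conditions ($x_2 = D$, $x_1 = \dot{x}_1 = \dot{x}_2 = 0$) with the assumed values $\omega_1 = \sqrt{6}$, $\omega_2 = 2$; the function names are illustrative only.

```python
import math

def beat_solution(t, D, omega_1, omega_2):
    """(x1, x2) written as a fast oscillation inside a slow envelope."""
    mean = 0.5 * (omega_1 + omega_2)
    half_difference = 0.5 * (omega_1 - omega_2)
    x1 = D * math.sin(mean * t) * math.sin(half_difference * t)
    x2 = D * math.cos(mean * t) * math.cos(half_difference * t)
    return x1, x2

def mode_sum_solution(t, D, omega_1, omega_2):
    """The same motion written as a superposition of the two eigenfrequencies."""
    x1 = 0.5 * D * (math.cos(omega_2 * t) - math.cos(omega_1 * t))
    x2 = 0.5 * D * (math.cos(omega_2 * t) + math.cos(omega_1 * t))
    return x1, x2

omega_1, omega_2 = math.sqrt(6.0), 2.0
# Time at which the slow envelope of x2 first vanishes, so that
# all of the oscillation amplitude has moved to the first mass.
transfer_time = math.pi / (omega_1 - omega_2)
print(beat_solution(0.0, 1.0, omega_1, omega_2))
print(beat_solution(transfer_time, 1.0, omega_1, omega_2))
```

The smaller the difference $\omega_1 - \omega_2$, the longer the transfer time, which is why weakly coupled oscillators exchange energy slowly.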

A better understanding of the energy flow occurring between the two coupled oscillators is given by using a $(x_1, x_2)$ configuration-space plot, shown in figure 12.3. The flow of energy occurring between the two coupled oscillators can be represented by choosing normal-mode coordinates $\eta_1$ and $\eta_2$ that are rotated by 45° with respect to the spatial coordinates $(x_1, x_2)$. These normal-mode coordinates $(\eta_1, \eta_2)$ correspond to the two normal modes of the coupled double-oscillator system.

12.3 Normal modes


The normal modes of the two-coupled oscillator system are obtained by a transformation to a pair of normal coordinates $(\eta_1, \eta_2)$ that are independent and correspond to the two normal modes. The pair of normal coordinates for this case are

$$\begin{aligned} \eta_1 &= x_1 - x_2 \\ \eta_2 &= x_1 + x_2 \end{aligned} \tag{12.12}$$

that is

$$\begin{aligned} x_1 &= \tfrac{1}{2}(\eta_2 + \eta_1) \\ x_2 &= \tfrac{1}{2}(\eta_2 - \eta_1) \end{aligned} \tag{12.13}$$

Substituting these into the equations of motion (12.1) gives

$$\begin{aligned} m(\ddot{\eta}_1 + \ddot{\eta}_2) + (\kappa + 2\kappa')\,\eta_1 + \kappa\,\eta_2 &= 0 \\ m(\ddot{\eta}_1 - \ddot{\eta}_2) + (\kappa + 2\kappa')\,\eta_1 - \kappa\,\eta_2 &= 0 \end{aligned} \tag{12.14}$$

Adding and subtracting these two equations gives

$$\begin{aligned} m\ddot{\eta}_1 + (\kappa + 2\kappa')\,\eta_1 &= 0 \\ m\ddot{\eta}_2 + \kappa\,\eta_2 &= 0 \end{aligned} \tag{12.15}$$

Note that the two coordinates $\eta_1$ and $\eta_2$ are uncoupled and therefore independent. The solutions of these equations are

$$\begin{aligned} \eta_1(t) &= C_1^{+} e^{i\omega_1 t} + C_1^{-} e^{-i\omega_1 t} \\ \eta_2(t) &= C_2^{+} e^{i\omega_2 t} + C_2^{-} e^{-i\omega_2 t} \end{aligned} \tag{12.16}$$

where $\eta_1$ corresponds to angular frequency $\omega_1$, and $\eta_2$ corresponds to $\omega_2$. The two coordinates $\eta_1$ and $\eta_2$ are called the normal coordinates and the two solutions are the normal modes with corresponding angular frequencies $\omega_1$ and $\omega_2$.

Figure 12.3: Motion of two coupled harmonic oscillators in the $(x_1, x_2)$ spatial configuration space and in terms of the normal modes $(\eta_1, \eta_2)$. Initial conditions are $x_2 = D$, $x_1 = \dot{x}_1 = \dot{x}_2 = 0$.

The $(\eta_1, \eta_2)$ axes of the two normal modes correspond to a rotation of 45° in configuration space, figure 12.3. The initial conditions chosen correspond to $\eta_1 = -\eta_2$ and thus both modes are excited with equal intensity. Note that there are 5 lobes along the $\eta_2$ axis versus 4 lobes along the $\eta_1$ axis, reflecting the ratio of the eigenfrequencies $\omega_1$ and $\omega_2$. Also note that the diamond shape of the motion in the $(x_1, x_2)$ configuration space illustrates that the extrema amplitudes for $x_2$ are a maximum when $x_1$ is zero, and vice versa. This is equivalent to the statement that the energies in the two oscillators are coupled, with the energy for the first oscillator being a maximum when the energy is a minimum for the second oscillator, and vice versa. By contrast, in the $(\eta_1, \eta_2)$ configuration space, the motion is bounded by a rectangle parallel to the $(\eta_1, \eta_2)$ axes, reflecting the fact that the extrema amplitudes, and corresponding energies, for the $\eta_1$ normal mode are constant and independent of the motion for the $\eta_2$ normal mode, and vice versa. The decoupling of the two normal modes is best illustrated by considering the case when only one of these two normal modes is excited. For the initial conditions $x_1(0) = -x_2(0)$ and $\dot{x}_1(0) = -\dot{x}_2(0)$, then $\eta_2(t) = 0$. That is, only the $\eta_1(t)$ normal mode is excited with frequency $\omega_1$, which corresponds to motion confined to the $\eta_1$ axis of figure 12.3.

Figure 12.4: Normal modes for two coupled oscillators.
As shown in figure 12.4, $\eta_1(t)$ is the antisymmetric mode in which the two masses oscillate out of phase such as to keep the center of mass of the two masses stationary. For the initial conditions $x_1(0) = x_2(0)$ and $\dot{x}_1(0) = \dot{x}_2(0)$, then $\eta_1(t) = 0$, that is, only the $\eta_2(t)$ normal mode is excited. The $\eta_2(t)$ normal mode is the symmetric mode where the two masses oscillate in phase with frequency $\omega_2$; it corresponds to motion along the $\eta_2$ axis. For the symmetric mode, both masses move together, leading to a constant extension of the coupling spring. As a result the frequency $\omega_2$ of the symmetric mode $\eta_2(t)$ is lower than the frequency $\omega_1$ of the antisymmetric mode $\eta_1(t)$. That is, the antisymmetric mode is stiffer since all three springs provide active restoring forces, compared to the symmetric mode where the coupling spring is neither stretched nor compressed. In general, for attractive forces the lowest frequency always occurs for the mode with the highest symmetry.
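The decoupling of the two modes can be illustrated by integrating the equations of motion (12.1) directly and confirming that exciting one normal mode leaves the other at rest. The sketch below uses a classical fourth-order Runge-Kutta step, which is a choice made here for illustration and not a method taken from the text; the parameter values $\kappa = 4$, $\kappa' = 1$, $m = 1$ and all function names are assumptions.

```python
import math

def accelerations(x1, x2, kappa, kappa_prime, m):
    """Accelerations from the coupled equations of motion (12.1)."""
    a1 = (-(kappa + kappa_prime) * x1 + kappa_prime * x2) / m
    a2 = (-(kappa + kappa_prime) * x2 + kappa_prime * x1) / m
    return a1, a2

def rk4_step(state, dt, kappa, kappa_prime, m):
    """Advance (x1, x2, v1, v2) by one Runge-Kutta step of size dt."""
    def derivative(s):
        x1, x2, v1, v2 = s
        a1, a2 = accelerations(x1, x2, kappa, kappa_prime, m)
        return (v1, v2, a1, a2)

    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = derivative(state)
    k2 = derivative(shifted(state, k1, 0.5 * dt))
    k3 = derivative(shifted(state, k2, 0.5 * dt))
    k4 = derivative(shifted(state, k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(state, dt, steps, kappa=4.0, kappa_prime=1.0, m=1.0):
    """Return the list of states visited, including the initial one."""
    history = [state]
    for _ in range(steps):
        state = rk4_step(state, dt, kappa, kappa_prime, m)
        history.append(state)
    return history

def normal_coordinates(x1, x2):
    """eta_1 = x1 - x2 (antisymmetric), eta_2 = x1 + x2 (symmetric)."""
    return x1 - x2, x1 + x2

# Excite only the antisymmetric mode, then only the symmetric mode.
antisym = simulate((1.0, -1.0, 0.0, 0.0), dt=0.001, steps=5000)
sym = simulate((1.0, 1.0, 0.0, 0.0), dt=0.001, steps=5000)
max_eta2 = max(abs(normal_coordinates(s[0], s[1])[1]) for s in antisym)
max_eta1 = max(abs(normal_coordinates(s[0], s[1])[0]) for s in sym)
print(max_eta2, max_eta1)
```

In the antisymmetric run each mass should follow $\cos\omega_1 t$ with $\omega_1 = \sqrt{6}$, and in the symmetric run $\cos\omega_2 t$ with $\omega_2 = 2$, which gives a direct numerical check on the eigenfrequencies.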


12.4 Center of mass oscillations

Transforming the coordinates into the center of mass of the two oscillating masses elucidates an interesting feature of the normal modes for the two-coupled linear oscillator. As illustrated in figure 12.1, the center-of-mass coordinate for the two mass system is

$$2R_{cm} = l + x_1 + l + l' + x_2 = 2l + l' + \eta_2$$

while the relative separation distance is

$$r = (l + l' + x_2) - (l + x_1) = l' - \eta_1$$

That is, the two normal modes are

$$\begin{aligned} \eta_1 &= l' - r \\ \eta_2 &= 2R_{cm} - 2l - l' \end{aligned} \tag{12.17}$$

The $\eta_1$ mode, which has angular frequency $\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}}$, corresponds to an oscillation of the relative separation $r$, while the center-of-mass location $R_{cm}$ is stationary. By contrast, the $\eta_2$ mode, with angular frequency $\omega_2 = \sqrt{\frac{\kappa}{m}}$, corresponds to an oscillation of the center of mass $R_{cm}$ with the relative separation $r$ being a constant.

Figure 12.5 illustrates the decoupled center-of-mass $R_{cm}$ and relative motions $r$ for both normal modes of the coupled double-oscillator system. The difference in angular frequencies and amplitudes is readily apparent.

Figure 12.5: Time dependence of the center-of-mass $R_{cm}$ and relative separation $r$ for two coupled linear oscillators assuming spring constants of $\kappa = 4M$ and $\kappa' = M$.

It is of interest to consider the special case where the spring constant $\kappa = 0$ for the two outside springs. Then the angular frequencies are $\omega_1 = \sqrt{\frac{2\kappa'}{m}}$ and $\omega_2 = 0$ for the two normal modes. When $\kappa = 0$ the $\eta_2$ mode is a spurious center-of-mass mode since it corresponds to an oscillation with $\omega_2 = 0$ in spite of the fact that there are no forces acting on the center of mass. That is, the center-of-mass momentum must be a constant of motion. This spurious center-of-mass oscillation is a consequence of measuring the displacements $(x_1, x_2)$ with respect to an arbitrary external reference that is not related to the center of mass of the coupled system. Spurious center-of-mass modes are encountered frequently in many-body coupled oscillator systems such as molecules and nuclei. In such cases it is necessary to project out the center-of-mass motion to eliminate such spurious solutions, as will be discussed later.
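The relation between the normal coordinates and the center-of-mass and relative coordinates of this section can be checked with a few lines of arithmetic. This is a sketch assuming equal masses located at $l + x_1$ and $l + l' + x_2$; the function names and the sample numbers are illustrative only.

```python
def center_of_mass_and_separation(x1, x2, l, l_prime):
    """Return (R_cm, r) for equal masses at l + x1 and l + l' + x2."""
    position_1 = l + x1
    position_2 = l + l_prime + x2
    r_cm = 0.5 * (position_1 + position_2)
    r = position_2 - position_1
    return r_cm, r

def normal_coordinates_from_cm(r_cm, r, l, l_prime):
    """eta_1 = l' - r and eta_2 = 2 R_cm - 2 l - l'."""
    eta_1 = l_prime - r
    eta_2 = 2.0 * r_cm - 2.0 * l - l_prime
    return eta_1, eta_2

x1, x2, l, l_prime = 0.3, -0.1, 2.0, 1.5
r_cm, r = center_of_mass_and_separation(x1, x2, l, l_prime)
eta_1, eta_2 = normal_coordinates_from_cm(r_cm, r, l, l_prime)
# eta_1 should reproduce x1 - x2 and eta_2 should reproduce x1 + x2.
print(eta_1, x1 - x2)
print(eta_2, x1 + x2)
```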
12.5 Weak coupling

If one of the two coupled linear oscillator masses is held fixed, then the other free mass will oscillate with a frequency

$$\omega_0 = \sqrt{\frac{\kappa + \kappa'}{M}} \tag{12.18}$$

The effect of coupling of the two oscillators is to split the degeneracy of the frequency for each mass to

$$\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{M}} > \omega_0 = \sqrt{\frac{\kappa + \kappa'}{M}} > \omega_2 = \sqrt{\frac{\kappa}{M}} \tag{12.19}$$

Thus the degeneracy is broken, and the two normal modes have frequencies straddling the single-oscillator frequency.

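It is interesting to consider the case where the coupling is weak because this situation occurs frequently in nature. The coupling is weak if the coupling constant $\kappa' \ll \kappa$. Then

$$\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{M}} = \sqrt{\frac{\kappa}{M}}\sqrt{1 + 4\varepsilon} \tag{12.20}$$

where

$$\varepsilon \equiv \frac{\kappa'}{2\kappa} \ll 1 \tag{12.21}$$

Thus

$$\omega_1 \approx \sqrt{\frac{\kappa}{M}}\,(1 + 2\varepsilon) \tag{12.22}$$

The natural frequency of a single oscillator was shown to be

$$\omega_0 = \sqrt{\frac{\kappa + \kappa'}{M}} \approx \sqrt{\frac{\kappa}{M}}\,(1 + \varepsilon) \tag{12.23}$$

that is

$$\sqrt{\frac{\kappa}{M}} \approx \omega_0\,(1 - \varepsilon) \tag{12.24}$$

Thus the frequencies for the normal modes for weak coupling can be written as

$$\omega_1 = \sqrt{\frac{\kappa}{M}}\,(1 + 2\varepsilon) \approx \omega_0\,(1 - \varepsilon)(1 + 2\varepsilon) \approx \omega_0\,(1 + \varepsilon) \tag{12.25}$$

while

$$\omega_2 = \sqrt{\frac{\kappa}{M}} \approx \omega_0\,(1 - \varepsilon) \tag{12.26}$$

That is, the two solutions are split equally about the single uncoupled oscillator value given by $\omega_0 = \sqrt{\frac{\kappa + \kappa'}{M}} \approx \sqrt{\frac{\kappa}{M}}\,(1 + \varepsilon)$. Note that the single uncoupled oscillator frequency $\omega_0$ depends on the coupling strength $\kappa'$.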
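The quality of the first-order weak-coupling estimates can be judged by comparing them with the exact eigenfrequencies. The following sketch assumes the illustrative values $\kappa = 1$, $\kappa' = 0.02$, $M = 1$, so that $\varepsilon = 0.01$; the function names are hypothetical.

```python
import math

def exact_frequencies(kappa, kappa_prime, M):
    """Exact (omega_1, omega_0, omega_2) for the coupled double oscillator."""
    omega_1 = math.sqrt((kappa + 2.0 * kappa_prime) / M)
    omega_0 = math.sqrt((kappa + kappa_prime) / M)
    omega_2 = math.sqrt(kappa / M)
    return omega_1, omega_0, omega_2

def weak_coupling_estimates(kappa, kappa_prime, M):
    """First-order estimates omega_0 (1 + eps) and omega_0 (1 - eps)."""
    eps = kappa_prime / (2.0 * kappa)
    omega_0 = math.sqrt((kappa + kappa_prime) / M)
    return omega_0 * (1.0 + eps), omega_0 * (1.0 - eps)

kappa, kappa_prime, M = 1.0, 0.02, 1.0
w1, w0, w2 = exact_frequencies(kappa, kappa_prime, M)
w1_est, w2_est = weak_coupling_estimates(kappa, kappa_prime, M)
# The residuals are second order in eps, so they shrink rapidly as kappa' -> 0.
print(w1 - w1_est, w2 - w2_est)
```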

This splitting of the characteristic frequencies is a feature exhibited by many systems of $n$ identical oscillators where half of the frequencies are shifted upwards and half downward. If $n$ is odd, then the central frequency is unshifted as illustrated for the case of $n = 3$. An example of this behavior is the Zeeman effect where the magnetic field couples the atomic motion resulting in a hyperfine splitting of the energy levels of the form illustrated.

Figure 12.6: Normal-mode frequencies for n=2 and n=3 weakly-coupled oscillators.
There are myriad examples involving weakly-coupled oscillators in many aspects of the natural world. The example of collective modes in nuclear physics, illustrated in example 12.13, is typical of applications to physics, while there are many examples applied to musical instruments, acoustics, and engineering. Weakly-coupled oscillators are a dominant theme throughout biology as illustrated by congregations of synchronously flashing fireflies, crickets that chirp in unison, an audience clapping at the end of a performance, networks of pacemaker cells in the heart, insulin-secreting cells in the pancreas, and neural networks in the brain and spinal cord that control rhythmic behaviors such as breathing, walking, and eating. Synchronous motion of a large number of weakly-coupled oscillators often leads to large collective motion of weakly-coupled systems as discussed in chapter 12.12.

12.1 Example: The Grand Piano

Schematic diagram of the action for a grand piano, including the strings, bridge and sounding board. Note that there are either two or three parallel strings per note, all hit by a single hammer.

The grand piano provides an excellent example of a weakly-coupled harmonic oscillator system that has normal modes. There are either two or three parallel strings per note that are stretched tightly parallel to the top of the horizontal sounding board. The strings press downwards on the bridge that is attached to the top of the sounding board. The strings for each note are excited when struck vertically upwards by a single hammer. In the bass section of the piano each note comprises two strings tuned to nearly the same frequency. The coupling of the motion of the strings is via the bridge plus sounding board. Normally, the hammer strikes both strings simultaneously, exciting the vertical symmetric mode, not the vertical antisymmetric mode. The bridge is connected to the sounding board, which moves the largest amount for the symmetric mode where both strings move the bridge in phase. This strong coupling produces a loud sound. The antisymmetric mode does not move the sounding board much since the strings at the bridge move out of phase. Consequently, the symmetric mode, which is strongly coupled to the sounding board, damps out more rapidly than the antisymmetric mode, which is weakly coupled to the sounding board and thus has a longer time constant for decay since the radiated sound energy is lower than for the symmetric mode.

The una-corda pedal (soft pedal) for a grand piano moves the action sideways such that the hammer strikes only one of the two strings, or two of the three strings, resulting in both the symmetric and antisymmetric modes being excited equally. The una-corda pedal produces a characteristically different tone than when the hammer simultaneously hits all the strings; that is, it produces a smaller transient component. The symmetric mode rapidly damps due to energy propagation by the sounding board. Thus the longer lasting antisymmetric mode becomes more prominent when both modes are equally excited using the una-corda pedal. The symmetric and antisymmetric modes have slightly different frequencies and produce beats, which also contribute to the different timbre produced using the una-corda pedal. For the mid and upper frequency range, the piano has three strings per note, which have one symmetric mode and two separate antisymmetric modes. To further complicate matters, the strings also can oscillate horizontally, which couples weakly to the bridge plus sounding board. The strengths with which these different modes are excited depend on subtle differences in the shape and roughness of the hammer head striking the strings. Primarily the hammer excites the two vertical modes rather than the horizontal modes.

12.6 General analytic theory for coupled linear oscillators


The above discussion of a coupled double-oscillator system has shown that it is possible to select symmetric and antisymmetric normal modes that are independent and each have characteristic frequencies. The normal coordinates for these two normal modes correspond to linear superpositions of the spatial amplitudes of the two oscillators and can be obtained by a rotation into the appropriate normal coordinate system. Extension of this to systems comprising $n$ coupled linear oscillators requires development of a general analytic theory that is capable of finding the normal modes and their eigenvalues and eigenvectors. As illustrated for the double oscillator, the solution of many coupled linear oscillators is a classic eigenvalue problem where one has to rotate to the principal axis system to project out the normal modes. The following discussion presents a general approach to the problem of finding the normal coordinates for a system of $n$ coupled linear oscillators.

Consider a conservative system of $n$ coupled oscillators, described in terms of generalized coordinates $q_k$ and time $t$, with subscript $k = 1, 2, 3, \ldots n$ for a system with $n$ degrees of freedom. The coupled oscillators are assumed to have a stable equilibrium with generalized coordinates $q_{k0}$ at equilibrium. In addition, it is assumed that the oscillation amplitudes are sufficiently small to ensure that the system is linear.

For the equilibrium position $q_k = q_{k0}$ the Lagrange equations must satisfy

$$\begin{aligned} \dot{q}_k &= 0 \\ \ddot{q}_k &= 0 \end{aligned} \tag{12.27}$$

Every non-zero term of the form $\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k}$ in Lagrange's equations must contain at least either $\dot{q}_k$ or $\ddot{q}_k$, which are zero at equilibrium; thus all such terms vanish at equilibrium. At equilibrium

$$\left(\frac{\partial L}{\partial q_k}\right)_0 = \left(\frac{\partial T}{\partial q_k}\right)_0 - \left(\frac{\partial U}{\partial q_k}\right)_0 = 0 \tag{12.28}$$

where the subscript 0 designates evaluation at equilibrium.


12.6.1 Kinetic energy tensor T

In chapter 7.6 it was shown that, in terms of fixed rectangular coordinates, the kinetic energy for $N$ bodies, with $n$ generalized coordinates, is expressed as

$$T = \frac{1}{2}\sum_{\alpha=1}^{N}\sum_{i=1}^{3} m_\alpha \dot{x}_{\alpha,i}^2 \tag{12.29}$$

Expressing these in terms of generalized coordinates $x_{\alpha,i} = x_{\alpha,i}(q_j, t)$ where $j = 1, 2, \ldots n$, then the generalized velocities are given by

$$\dot{x}_{\alpha,i} = \sum_{j=1}^{n} \frac{\partial x_{\alpha,i}}{\partial q_j}\dot{q}_j + \frac{\partial x_{\alpha,i}}{\partial t} \tag{12.30}$$

As discussed in chapter 7.6, if the system is scleronomic then the partial derivative

$$\frac{\partial x_{\alpha,i}}{\partial t} = 0 \tag{12.31}$$

Thus the kinetic energy, equation 12.29, of a scleronomic system can be written as a homogeneous quadratic function of the generalized velocities

$$T = \frac{1}{2}\sum_{j,k}^{n} T_{jk}\,\dot{q}_j \dot{q}_k \tag{12.32}$$

where the components of the kinetic energy tensor T are

$$T_{jk} \equiv \sum_{\alpha}^{N} m_\alpha \sum_{i}^{3} \frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k} \tag{12.33}$$

Note that if the velocities $\dot{q}$ correspond to translational velocity, then the kinetic energy tensor T corresponds to an effective mass tensor, whereas if the velocities correspond to angular rotational velocities, then the kinetic energy tensor T corresponds to the inertia tensor.
It is possible to make an expansion of the $T_{jk}$ about the equilibrium values of the form

$$T_{jk}(q_1, q_2, \ldots q_n) = T_{jk}(q_{i0}) + \sum_{l}\left(\frac{\partial T_{jk}}{\partial q_l}\right)_0 q_l + \cdots \tag{12.34}$$

Only the leading, constant term $T_{jk}(q_{i0})$ will be kept, since the terms that are linear and higher order in $q_l$ contribute to the kinetic energy at the same order as the higher-order terms ignored in the Taylor expansion of the potential. Thus, at the equilibrium point, assume that $\left(\frac{\partial T}{\partial q_k}\right)_0 = 0$ where $k = 1, 2, 3, \ldots n$.


12.6.2 Potential energy tensor V

Equations 12.28 plus 12.34 imply that

$$\left(\frac{\partial U}{\partial q_k}\right)_0 = 0 \tag{12.35}$$

where $k = 1, 2, 3, \ldots n$.

Make a Taylor expansion about equilibrium for the potential energy, assuming for simplicity that the coordinates have been translated to ensure that $q_k = 0$ at equilibrium. This gives

$$U(q_1, q_2, \ldots q_n) = U_0 + \sum_k \left(\frac{\partial U}{\partial q_k}\right)_0 q_k + \frac{1}{2}\sum_{j,k}\left(\frac{\partial^2 U}{\partial q_j \partial q_k}\right)_0 q_j q_k + \cdots \tag{12.36}$$

The linear term is zero since $\left(\frac{\partial U}{\partial q_k}\right)_0 = 0$ at the equilibrium point, and without loss of generality, the potential can be measured with respect to $U_0$. Assume that the amplitudes are small, then the expansion can be restricted to the quadratic term, corresponding to the simple linear oscillator potential

$$U(q_1, q_2, \ldots q_n) - U_0 = U'(q_1, q_2, \ldots q_n) = \frac{1}{2}\sum_{j,k}\left(\frac{\partial^2 U}{\partial q_j \partial q_k}\right)_0 q_j q_k = \frac{1}{2}\sum_{j,k} V_{jk}\,q_j q_k \tag{12.37}$$

That is

$$U'(q_1, q_2, \ldots q_n) = \frac{1}{2}\sum_{j,k} V_{jk}\,q_j q_k \tag{12.38}$$

where the components of the potential energy tensor V are defined as

$$V_{jk} \equiv \left(\frac{\partial^2 U}{\partial q_j \partial q_k}\right)_0 \tag{12.39}$$

Note that the order of differentiation is unimportant and thus the quantity $V_{jk}$ is symmetric

$$V_{jk} = V_{kj} \tag{12.40}$$

The motion of the system has been specified for small oscillations around the equilibrium position and it has been shown that $U'(q_1, q_2, \ldots q_n)$ has a minimum value at equilibrium which is taken to be zero for convenience.

In conclusion, equations (12.32) and (12.38) give

$$T = \frac{1}{2}\sum_{j,k}^{n} T_{jk}\,\dot{q}_j \dot{q}_k \tag{12.41}$$

$$U' = \frac{1}{2}\sum_{j,k}^{n} V_{jk}\,q_j q_k \tag{12.42}$$

where the components of the kinetic energy tensor T and potential energy tensor V are

$$T_{jk} \equiv \left(\sum_{\alpha}^{N} m_\alpha \sum_{i}^{3} \frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k}\right)_0 \tag{12.43}$$

$$V_{jk} \equiv \left(\frac{\partial^2 U'}{\partial q_j \partial q_k}\right)_0 \tag{12.44}$$

Note that $q_j$ and $q_k$ may have different units, but all the terms in the summations for both $T$ and $U'$ have units of energy. The $V_{jk}$ and $T_{jk}$ values are evaluated at the equilibrium point, and thus both $V_{jk}$ and $T_{jk}$ are $n \times n$ arrays of values evaluated at the equilibrium location.
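When the potential is known only as a function, the components $V_{jk}$ can be estimated numerically from second derivatives at equilibrium. The sketch below uses central finite differences, which is an illustrative choice made here and not a procedure from the text; it assumes the coordinates are shifted so that equilibrium is at $q = 0$, and the function names and step size are hypothetical.

```python
def potential_energy_tensor(U, n, h=1e-4):
    """Central-difference estimate of V_jk = d^2U/dq_j dq_k evaluated at q = 0."""
    def evaluate(j, dj, k, dk):
        q = [0.0] * n
        q[j] += dj
        q[k] += dk
        return U(q)

    V = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            V[j][k] = (evaluate(j, h, k, h) - evaluate(j, h, k, -h)
                       - evaluate(j, -h, k, h) + evaluate(j, -h, k, -h)) / (4.0 * h * h)
    return V

def two_oscillator_potential(q, kappa=4.0, kappa_prime=1.0):
    """U for two masses joined by outer springs kappa and a coupling spring kappa'."""
    return (0.5 * kappa * q[0] ** 2 + 0.5 * kappa * q[1] ** 2
            + 0.5 * kappa_prime * (q[1] - q[0]) ** 2)

V = potential_energy_tensor(two_oscillator_potential, 2)
# For this quadratic potential the result should be close to
# [[kappa + kappa', -kappa'], [-kappa', kappa + kappa']].
print(V)
```

For a strictly quadratic potential the finite-difference estimate is exact apart from rounding; for a general potential it carries a truncation error of order $h^2$.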


12.6.3 Equations of motion

Both the kinetic energy and potential energy terms are products of the coordinates, leading to a set of coupled equations that are complicated to solve. The problem is greatly simplified by selecting a set of normal coordinates for which both T and U are diagonal; then the coupling terms disappear. Thus a coordinate transformation must be found that simultaneously diagonalizes $T_{jk}$ and $V_{jk}$ in order to obtain a set of normal coordinates.

The kinetic energy $T$ is only a function of the generalized velocities $\dot{q}_k$ while the conservative potential energy is only a function of the generalized coordinates $q_k$. Thus the Lagrange equations

$$\frac{\partial L}{\partial q_k} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k} = 0 \tag{12.45}$$

reduce to

$$\frac{\partial U}{\partial q_k} + \frac{d}{dt}\frac{\partial T}{\partial \dot{q}_k} = 0 \tag{12.46}$$

But

$$\frac{\partial U}{\partial q_k} = \sum_j V_{jk}\,q_j \tag{12.47}$$

and

$$\frac{\partial T}{\partial \dot{q}_k} = \sum_j T_{jk}\,\dot{q}_j \tag{12.48}$$

Thus the Lagrange equations reduce to the following set of equations of motion,

$$\sum_j \left(V_{jk}\,q_j + T_{jk}\,\ddot{q}_j\right) = 0 \tag{12.49}$$

There is one such equation for each $k$, where $1 \le k \le n$, giving a set of $n$ second-order linear homogeneous differential equations with constant coefficients. Since the system is oscillatory, it is natural to try a solution of the form

$$q_j(t) = a_j e^{i(\omega t - \delta)} \tag{12.50}$$

Assuming that the system is conservative, then this implies that $\omega$ is real, since an imaginary term for $\omega$ would lead to an exponential damping term. The arbitrary constants are the real amplitude $a_j$ and the phase $\delta$. Substitution of this trial solution for each $k$ leads to a set of equations

$$\sum_j \left(V_{jk} - \omega^2 T_{jk}\right) a_j = 0 \tag{12.51}$$

where the common factor $e^{i(\omega t - \delta)}$ has been removed. Equation 12.51 corresponds to a set of $n$ linear homogeneous algebraic equations that the $a_j$ amplitudes must satisfy for each $k$. For a non-trivial solution to exist, the determinant of the coefficients must vanish, that is

$$\begin{vmatrix} V_{11} - \omega^2 T_{11} & V_{12} - \omega^2 T_{12} & V_{13} - \omega^2 T_{13} & \cdots \\ V_{12} - \omega^2 T_{12} & V_{22} - \omega^2 T_{22} & V_{23} - \omega^2 T_{23} & \cdots \\ V_{13} - \omega^2 T_{13} & V_{23} - \omega^2 T_{23} & V_{33} - \omega^2 T_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{vmatrix} = 0 \tag{12.52}$$

where the symmetry $V_{jk} = V_{kj}$ has been included. This is the standard eigenvalue problem for which the above determinant gives the secular equation or the characteristic equation. It is an equation of degree $n$ in $\omega^2$. The $n$ roots of this equation are $\omega_r^2$, where $\omega_r$ are the characteristic frequencies or eigenfrequencies of the normal modes.

Substitution of $\omega_r^2$ into equation 12.51 determines the ratio $a_{1r} : a_{2r} : a_{3r} : \ldots : a_{nr}$ for this solution, which defines the components of the $n$-dimensional eigenvector $\mathbf{a}_r$. That is, solution of the secular equations has determined the eigenvalues and eigenvectors of the $n$ solutions of the coupled-channel system.
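For two degrees of freedom the secular determinant is a quadratic in $\omega^2$, so the eigenfrequencies and amplitude ratios can be computed in closed form. The sketch below is a minimal illustration restricted to symmetric $2 \times 2$ tensors at a stable equilibrium (so that both roots are non-negative); the function name and the tolerance are assumptions made for this example.

```python
import math

def solve_two_mode_system(V, T):
    """Solve det(V - w^2 T) = 0 for symmetric 2x2 V and T.

    Returns a list of (omega_r, (a_1r, a_2r)), highest frequency first.
    The amplitude vectors are unnormalized; only their ratio is determined.
    """
    # Expand the determinant as A*lam^2 + B*lam + C = 0 with lam = omega^2.
    A = T[0][0] * T[1][1] - T[0][1] ** 2
    B = -(V[0][0] * T[1][1] + V[1][1] * T[0][0] - 2.0 * V[0][1] * T[0][1])
    C = V[0][0] * V[1][1] - V[0][1] ** 2
    root = math.sqrt(B * B - 4.0 * A * C)
    modes = []
    for lam in ((-B + root) / (2.0 * A), (-B - root) / (2.0 * A)):
        d11 = V[0][0] - lam * T[0][0]
        d12 = V[0][1] - lam * T[0][1]
        d22 = V[1][1] - lam * T[1][1]
        if abs(d11) + abs(d12) > 1e-12:
            a = (-d12, d11)            # satisfies d11*a1 + d12*a2 = 0
        elif abs(d22) > 1e-12:
            a = (d22, -d12)            # fall back to the second equation
        else:
            a = (1.0, 0.0)             # fully degenerate: any direction works
        modes.append((math.sqrt(lam), a))
    return modes

V = [[5.0, -1.0], [-1.0, 5.0]]   # kappa = 4, kappa' = 1
T = [[1.0, 0.0], [0.0, 1.0]]     # unit masses
modes = solve_two_mode_system(V, T)
print(modes)
```

For larger $n$ the same problem is usually handed to a numerical generalized-eigenvalue routine rather than expanded by hand.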
12.6.4 Superposition

The equations of motion $\sum_j \left(V_{jk}\,q_j + T_{jk}\,\ddot{q}_j\right) = 0$ are linear equations that satisfy superposition. Thus the most general solution $q_j(t)$ can be a superposition of the $n$ eigenvectors $\mathbf{a}_r$, that is

$$q_j(t) = \sum_r^n a_{jr}\,e^{i(\omega_r t - \delta_r)} \tag{12.53}$$

Only the real part of $q_j(t)$ is meaningful, that is,

$$q_j(t) = \mathrm{Re}\sum_r^n a_{jr}\,e^{i(\omega_r t - \delta_r)} = \sum_r^n a_{jr}\cos(\omega_r t - \delta_r) \tag{12.54}$$

Thus the most general solution of these linear equations involves a sum over the eigenvectors of the system, which are cosine functions of the corresponding eigenfrequencies.


12.6.5 Eigenfunction orthonormality

It can be shown that the eigenvectors are orthogonal. In addition, the above procedure only determines ratios of amplitudes, thus there is an indeterminacy that can be used to normalize the $a_{jr}$. Thus the eigenvectors form an orthonormal set. Orthonormality of the eigenfunctions for the rank 3 inertia tensor was illustrated in chapter 11.10.2. Similar arguments allow orthonormality to be extended to higher rank cases such as that for $n$-body coupled oscillators.

The eigenfunction orthogonality for $n$ coupled oscillators can be proved by writing equation 12.51 for both the $s$th root and the $r$th root. That is,

$$\sum_k V_{jk}\,a_{ks} = \omega_s^2 \sum_k T_{jk}\,a_{ks} \tag{12.55}$$

$$\sum_j V_{jk}\,a_{jr} = \omega_r^2 \sum_j T_{jk}\,a_{jr} \tag{12.56}$$

Multiply equation 12.55 by $a_{jr}$ and sum over $j$. Similarly multiply equation 12.56 by $a_{ks}$ and sum over $k$. These summations lead to

$$\sum_{jk} V_{jk}\,a_{jr} a_{ks} = \omega_s^2 \sum_{jk} T_{jk}\,a_{jr} a_{ks} \tag{12.57}$$

$$\sum_{jk} V_{jk}\,a_{jr} a_{ks} = \omega_r^2 \sum_{jk} T_{jk}\,a_{jr} a_{ks} \tag{12.58}$$

Note that the left-hand sides of these two equations are identical. Thus taking the difference between these equations gives

$$\left(\omega_r^2 - \omega_s^2\right)\sum_{jk} T_{jk}\,a_{jr} a_{ks} = 0 \tag{12.59}$$

Note that if $\left(\omega_r^2 - \omega_s^2\right) \neq 0$, that is, assuming that the eigenfrequencies are not degenerate, then to ensure that equation 12.59 is zero requires that

$$\sum_{jk} T_{jk}\,a_{jr} a_{ks} = 0 \qquad r \neq s \tag{12.60}$$

This shows that the eigenfunctions are orthogonal. If the eigenfrequencies are degenerate, i.e. $\omega_r^2 = \omega_s^2$, then, with no loss of generality, the axes $r$ and $s$ can be chosen to be orthogonal.

The eigenfunction normalization can be chosen freely since only ratios of the eigenfunction components $a_{jr}$ are determined when $\omega_r$ is used in equation 12.51. The kinetic energy, given by equation 12.32, must be positive, or zero for the case of a static system. That is

$$T = \frac{1}{2}\sum_{j,k}^{n} T_{jk}\,\dot{q}_j \dot{q}_k \geq 0 \tag{12.61}$$
Using the time derivative of equation 12.54 to determine $\dot{q}_j$, and inserting this into equation 12.61, gives that the kinetic energy is

$$T = \frac{1}{2}\sum_{j,k}^{n} T_{jk}\,\dot{q}_j \dot{q}_k = \frac{1}{2}\sum_{j,k}^{n}\sum_{r,s}^{n} T_{jk}\,\omega_r \omega_s\,a_{jr}\sin(\omega_r t - \delta_r)\,a_{ks}\sin(\omega_s t - \delta_s) \tag{12.62}$$

For the diagonal term $r = s$

$$T = \frac{1}{2}\sum_{j,k}^{n} T_{jk}\,\dot{q}_j \dot{q}_k = \frac{1}{2}\sum_{r}^{n} \omega_r^2 \sin^2(\omega_r t - \delta_r)\left[\sum_{j,k} T_{jk}\,a_{jr} a_{kr}\right] \geq 0 \tag{12.63}$$

Since the term in the square brackets must be positive, then

$$\sum_{j,k} T_{jk}\,a_{jr} a_{kr} \geq 0 \tag{12.64}$$

Since this sum must be a positive number, and the magnitude of the amplitudes can be chosen freely, then it is possible to normalize the eigenfunction amplitudes to unity. That is, choose that

$$\sum_{j,k} T_{jk}\,a_{jr} a_{kr} = 1 \tag{12.65}$$

The orthogonality equation 12.60 and the normalization equation 12.65 can be combined into a single orthonormalization equation

$$\sum_{j,k} T_{jk}\,a_{jr} a_{ks} = \delta_{rs} \tag{12.66}$$

This has shown that the eigenvectors form an orthonormal set.

Since the $j$th component of the $r$th eigenvector is $a_{jr}$, then the $r$th eigenvector can be written in the form

$$\mathbf{a}_r = \sum_j a_{jr}\,\hat{\mathbf{e}}_j \tag{12.67}$$

where $\hat{\mathbf{e}}_j$ are the unit vectors for the generalized coordinates.
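The orthonormalization condition, in which the kinetic energy tensor acts as the metric, can be verified directly for a small example. The sketch below assumes two equal masses $M = 2$ (an arbitrary illustrative value) and the unnormalized amplitude vectors $(1, -1)$ and $(1, 1)$ of the antisymmetric and symmetric modes; the helper names are hypothetical.

```python
import math

def t_inner(T, a, b):
    """Sum over j, k of T_jk * a_j * b_k (the T-weighted inner product)."""
    n = len(a)
    return sum(T[j][k] * a[j] * b[k] for j in range(n) for k in range(n))

def t_normalize(T, a):
    """Rescale a so that its T-weighted inner product with itself is 1."""
    norm = math.sqrt(t_inner(T, a, a))
    return [component / norm for component in a]

M = 2.0
T = [[M, 0.0], [0.0, M]]
a1 = t_normalize(T, [1.0, -1.0])   # antisymmetric mode
a2 = t_normalize(T, [1.0, 1.0])    # symmetric mode
# The matrix of T-weighted inner products should be the identity (delta_rs).
gram = [[t_inner(T, u, v) for v in (a1, a2)] for u in (a1, a2)]
print(gram)
```

Note that the normalized components scale as $1/\sqrt{2M}$, so the eigenvectors are unit vectors only with respect to T, not in the ordinary Euclidean sense.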

12.6.6 Normal coordinates

The above general solution of the coupled-oscillator problem is best expressed in terms of the normal coordinates, which are independent. It is more transparent if the superposition of the normal modes is written in the form

$$q_j(t) = \sum_r^n \beta_r\,a_{jr}\,e^{i\omega_r t} \tag{12.68}$$

where the complex factor $\beta_r$ includes the arbitrary scale factor to allow for arbitrary amplitudes $q_j$, as well as the fact that the amplitudes $a_{jr}$ have been normalized and the phase factor $\delta_r$ has been chosen.

Define

$$\eta_r(t) \equiv \beta_r\,e^{i\omega_r t} \tag{12.69}$$

then equation 12.68 can be written as

$$q_j(t) = \sum_r^n a_{jr}\,\eta_r(t) \tag{12.70}$$

Equation 12.70 can be expressed schematically as the matrix multiplication

$$\mathbf{q} = \{\mathbf{a}\}\cdot\boldsymbol{\eta} \tag{12.71}$$

The $\eta_r(t)$ are the normal coordinates, which can be expressed in the form

$$\boldsymbol{\eta} = \{\mathbf{a}\}^{-1}\mathbf{q} \tag{12.72}$$

Each normal mode $\eta_r$ corresponds to a single eigenfrequency $\omega_r$ and satisfies the linear oscillator equation

$$\ddot{\eta}_r + \omega_r^2\,\eta_r = 0 \tag{12.73}$$
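The inverse transformation from the generalized coordinates to the normal coordinates does not require an explicit matrix inversion when the eigenvectors are normalized with respect to the kinetic energy tensor. The short derivation below is a sketch that assumes the orthonormalization condition $\sum_{j,k} T_{jk} a_{jr} a_{ks} = \delta_{rs}$ and the expansion $q_k = \sum_s a_{ks}\eta_s$ from this section; the matrix shorthand $\{\mathbf{a}\}^{T}\mathbf{T}$ is notation introduced here for illustration.

```latex
\begin{align*}
\sum_{j,k} a_{jr}\, T_{jk}\, q_k
  &= \sum_{j,k} a_{jr}\, T_{jk} \sum_{s} a_{ks}\, \eta_s
     && \text{insert } q_k = \textstyle\sum_s a_{ks}\eta_s \\
  &= \sum_{s} \Bigl( \sum_{j,k} T_{jk}\, a_{jr}\, a_{ks} \Bigr) \eta_s
     && \text{reorder the sums} \\
  &= \sum_{s} \delta_{rs}\, \eta_s = \eta_r
     && \text{orthonormality} \\[4pt]
\text{hence}\qquad
\eta_r &= \sum_{j,k} a_{jr}\, T_{jk}\, q_k ,
\qquad \{\mathbf{a}\}^{-1} = \{\mathbf{a}\}^{T}\,\mathbf{T}
\end{align*}
```

For the two equal masses of section 12.2, with normalized eigenvectors of magnitude $1/\sqrt{2M}$, this reproduces normal coordinates proportional to $x_1 - x_2$ and $x_1 + x_2$.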
12.7 Two-body coupled oscillator systems

The two-body coupled oscillator is the simplest coupled-oscillator system that illustrates the general features of coupled oscillators. The following four examples involve parallel and series couplings of two linear oscillators or two plane pendula.

12.2 Example: Two coupled linear oscillators

The coupled double-oscillator problem, figure 12.1 discussed in chapter 12.2, can be used to demonstrate that the general analytic theory gives the same solution as obtained by direct solution of the equations of motion in chapter 12.2.

1) The first stage is to determine the potential and kinetic energies using an appropriate set of generalized coordinates, which here are $x_1$ and $x_2$. The potential energy is

$$U = \frac{1}{2}\kappa x_1^2 + \frac{1}{2}\kappa x_2^2 + \frac{1}{2}\kappa'(x_2 - x_1)^2 = \frac{1}{2}(\kappa + \kappa')\,x_1^2 + \frac{1}{2}(\kappa + \kappa')\,x_2^2 - \kappa' x_1 x_2$$

while the kinetic energy is given by

$$T = \frac{1}{2}M\dot{x}_1^2 + \frac{1}{2}M\dot{x}_2^2$$

2) The second stage is to evaluate the potential energy V and kinetic energy T tensors. The potential energy tensor V is nondiagonal since $V_{jk}$ gives

$$V_{11} = \left(\frac{\partial^2 U}{\partial q_1 \partial q_1}\right)_0 = \kappa + \kappa' = V_{22}$$

$$V_{12} = \left(\frac{\partial^2 U}{\partial q_1 \partial q_2}\right)_0 = -\kappa' = V_{21}$$

That is, the potential energy tensor V is

$$\mathbf{V} = \begin{pmatrix} \kappa + \kappa' & -\kappa' \\ -\kappa' & \kappa + \kappa' \end{pmatrix}$$

Similarly, the kinetic energy is given by

$$T = \frac{1}{2}M\dot{x}_1^2 + \frac{1}{2}M\dot{x}_2^2 = \frac{1}{2}\sum_{j,k} T_{jk}\,\dot{q}_j \dot{q}_k$$

Since $T_{11} = T_{22} = M$ and $T_{12} = T_{21} = 0$ then the kinetic energy tensor T is

$$\mathbf{T} = \begin{pmatrix} M & 0 \\ 0 & M \end{pmatrix}$$

Note that for this case, the kinetic energy tensor T equals the mass tensor, which is diagonal, whereas the potential energy tensor equals the spring constant tensor, which is nondiagonal.

3) The third stage is to use the potential energy V and kinetic energy T tensors to evaluate the secular determinant using equation 12.52

$$\begin{vmatrix} \kappa + \kappa' - M\omega^2 & -\kappa' \\ -\kappa' & \kappa + \kappa' - M\omega^2 \end{vmatrix} = 0$$

The expansion of this secular determinant yields

$$(\kappa + \kappa' - M\omega^2)^2 - \kappa'^2 = 0$$

That is

$$(\kappa + \kappa' - M\omega^2) = \pm\kappa'$$

Solving for $\omega_r$ gives

$$\omega_r = \sqrt{\frac{\kappa + \kappa' \pm \kappa'}{M}}$$

The solutions are

$$\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{M}} \qquad \omega_2 = \sqrt{\frac{\kappa}{M}}$$

which is the same as derived previously (equations 12.7 to 12.9).

4) The fourth step is to insert either one of these eigenfrequencies into the secular equation

$$\sum_j \left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0 \qquad (a)$$

Consider the secular equation (a) for k = 1

$$(\kappa + \kappa' - \omega_r^2 M)\, a_{1r} - \kappa' a_{2r} = 0$$

Then for the first eigenfrequency $\omega_1$, that is, k = 1, r = 1

$$(\kappa + \kappa' - \kappa - 2\kappa')\, a_{11} - \kappa' a_{21} = 0$$

which simplifies to

$$a_{11} = -a_{21}$$

Similarly, for the other eigenfrequency $\omega_2$, that is, k = 1, r = 2

$$(\kappa + \kappa' - \kappa)\, a_{12} - \kappa' a_{22} = 0$$

which simplifies to

$$a_{12} = a_{22}$$

5) The final stage is to write the general coordinates in terms of the normal coordinates $\eta_r(t) = \beta_r e^{i\omega_r t}$. Thus

$$x_1 = a_{11}\eta_1 + a_{12}\eta_2 = a_{11}\eta_1 + a_{22}\eta_2$$

and

$$x_2 = a_{21}\eta_1 + a_{22}\eta_2 = -a_{11}\eta_1 + a_{22}\eta_2$$

Adding or subtracting gives that the normal modes are

$$\eta_1 = \frac{1}{2a_{11}}(x_1 - x_2) \qquad \eta_2 = \frac{1}{2a_{22}}(x_2 + x_1)$$

Thus the symmetric normal mode $\eta_2$ corresponds to an oscillation of the center-of-mass with the lower frequency $\omega_2 = \sqrt{\kappa/M}$. This frequency is the same as for one single mass on a spring of spring constant $\kappa$, which is as expected since the masses vibrate in unison and thus the coupling spring force does not act. The antisymmetric mode $\eta_1$ has the higher frequency $\omega_1 = \sqrt{(\kappa + 2\kappa')/M}$ since the restoring force includes both the main spring plus the coupling spring.

The  above  example  illustrates  that  the  general  analytic  theory  for  coupled  linear  oscillators  gives  the 
same  answer  as  obtained  in  chapter  12.2  using  Newton’s  equations  of  motion.  However,  the  general  analytic 
theory  is  a more  powerful  technique  for  solving  complicated  coupled  oscillator  systems.  Thus  the  general 
analytic  theory  will  be  used  for  solving  all  the  following  coupled  oscillator  problems. 
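The five stages above can also be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `normal_modes` and the numerical values of the spring constants and mass are illustrative choices, not taken from the text. It solves the eigenvalue problem $\mathbf{V}a = \omega^2\mathbf{T}a$ for the coupled double oscillator.

```python
import numpy as np

def normal_modes(V, T):
    """Solve V a = omega^2 T a.

    Returns omega^2 sorted in ascending order and the matching
    eigenvectors as columns (hypothetical helper for illustration).
    """
    w2, a = np.linalg.eig(np.linalg.solve(T, V))
    order = np.argsort(w2.real)
    return w2.real[order], a.real[:, order]

# Illustrative values: main springs kappa, coupling spring kappa_c, mass M
kappa, kappa_c, M = 1.0, 0.5, 2.0

V = np.array([[kappa + kappa_c, -kappa_c],
              [-kappa_c, kappa + kappa_c]])   # potential energy tensor
T = np.diag([M, M])                            # kinetic energy tensor

omega2, modes = normal_modes(V, T)
# omega2[0] is the symmetric (center-of-mass) mode,
# omega2[1] is the antisymmetric mode.
```

With these inputs the lower eigenvalue should equal $\kappa/M$ and the higher one $(\kappa + 2\kappa')/M$, and the two eigenvector columns should have equal and opposite components respectively.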
12.3 Example: Two equal masses series-coupled by two equal springs

Consider the series-coupled system shown in the figure.

Figure: Two equal masses series-coupled by two equal springs.

1) The first stage is to determine the potential and kinetic energies using an appropriate set of generalized coordinates, which here are $x_1$ and $x_2$. The potential energy is

$$U = \frac{1}{2}\kappa x_1^2 + \frac{1}{2}\kappa (x_2 - x_1)^2 = \kappa x_1^2 + \frac{1}{2}\kappa x_2^2 - \kappa x_1 x_2$$

while the kinetic energy is given by

$$T = \frac{1}{2}M\dot{x}_1^2 + \frac{1}{2}M\dot{x}_2^2$$


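2) The second stage is to evaluate the potential energy V and kinetic energy T tensors. The potential energy tensor V is nondiagonal since $V_{jk}$ gives

$$V_{11} = \left(\frac{\partial^2 U}{\partial q_1 \partial q_1}\right)_0 = 2\kappa$$

$$V_{12} = \left(\frac{\partial^2 U}{\partial q_1 \partial q_2}\right)_0 = -\kappa = V_{21}$$

$$V_{22} = \left(\frac{\partial^2 U}{\partial q_2 \partial q_2}\right)_0 = \kappa$$

That is, the potential energy tensor V is

$$\mathbf{V} = \begin{pmatrix} 2\kappa & -\kappa \\ -\kappa & \kappa \end{pmatrix}$$

Similarly, since the kinetic energy is given by

$$T = \frac{1}{2}M\dot{x}_1^2 + \frac{1}{2}M\dot{x}_2^2 = \frac{1}{2}\sum_{j,k} T_{jk}\dot{q}_j\dot{q}_k$$

then $T_{11} = T_{22} = M$ and $T_{12} = T_{21} = 0$. Thus the kinetic energy tensor T is

$$\mathbf{T} = \begin{pmatrix} M & 0 \\ 0 & M \end{pmatrix}$$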


Note  that  for  this  case  the  kinetic  energy  tensor  is  diagonal  whereas  the  potential  energy  tensor  is  nondiagonal. 

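3) The third stage is to use the potential energy V and kinetic energy T tensors to evaluate the secular determinant using equation 12.52

$$\begin{vmatrix} 2\kappa - M\omega^2 & -\kappa \\ -\kappa & \kappa - M\omega^2 \end{vmatrix} = 0$$

The expansion of this secular determinant yields

$$(2\kappa - M\omega^2)(\kappa - M\omega^2) - \kappa^2 = 0$$

That is

$$\omega^4 - 3\frac{\kappa}{M}\omega^2 + \frac{\kappa^2}{M^2} = 0$$

The solutions are

$$\omega_1 = \frac{\sqrt{5} + 1}{2}\sqrt{\frac{\kappa}{M}} \qquad \omega_2 = \frac{\sqrt{5} - 1}{2}\sqrt{\frac{\kappa}{M}}$$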


4) The fourth step is to insert these eigenfrequencies into the secular equation 12.51

$$\sum_j \left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0$$

Consider k = 1 in the above equation

$$(2\kappa - \omega_r^2 M)\, a_{1r} - \kappa a_{2r} = 0$$

Then for eigenfrequency $\omega_1$, that is, k = 1, r = 1

$$\frac{\sqrt{5} - 1}{2}\, a_{11} = -a_{21}$$

Similarly, for k = 1, r = 2

$$\frac{\sqrt{5} + 1}{2}\, a_{12} = a_{22}$$

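5) The final stage is to write the general coordinates in terms of the normal coordinates $\eta_r(t) = \beta_r e^{i\omega_r t}$.

Thus

$$x_1 = a_{11}\eta_1 + a_{12}\eta_2 = a_{11}\eta_1 + \frac{2a_{22}}{\sqrt{5} + 1}\eta_2$$

and

$$x_2 = a_{21}\eta_1 + a_{22}\eta_2 = -\frac{\sqrt{5} - 1}{2}a_{11}\eta_1 + a_{22}\eta_2$$

Adding or subtracting suitable multiples of these equations, and using $\frac{\sqrt{5}+1}{2} + \frac{\sqrt{5}-1}{2} = \sqrt{5}$, gives that the normal modes are

$$\eta_1 = \frac{1}{a_{11}\sqrt{5}}\left(\frac{\sqrt{5} + 1}{2}x_1 - x_2\right)$$

$$\eta_2 = \frac{1}{a_{22}\sqrt{5}}\left(x_1 + \frac{\sqrt{5} + 1}{2}x_2\right)$$

Thus the symmetric normal mode has the lower frequency $\omega_2 = \frac{\sqrt{5}-1}{2}\sqrt{\frac{\kappa}{M}}$. The antisymmetric mode has the frequency $\omega_1 = \frac{\sqrt{5}+1}{2}\sqrt{\frac{\kappa}{M}}$ since both springs provide the restoring force. This case is interesting in that for both normal modes, the amplitudes for the motion of the two masses are different.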
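The unequal amplitudes noted above can be confirmed with a short numerical check. This is a sketch only, assuming NumPy; the values of `kappa` and `M` are arbitrary illustrative numbers. Because T is a multiple of the identity here, the problem reduces to an ordinary symmetric eigenvalue problem for V/M.

```python
import numpy as np

kappa, M = 3.0, 1.5   # illustrative values, not from the text

# Potential energy tensor for the series-coupled system
V = kappa * np.array([[2.0, -1.0],
                      [-1.0, 1.0]])

# T = M * identity, so V a = omega^2 T a becomes (V / M) a = omega^2 a
omega2, a = np.linalg.eigh(V / M)
omega = np.sqrt(omega2)          # ascending: omega[0] is the lower mode

# Amplitude ratio x2/x1 in each mode
ratio_low = a[1, 0] / a[0, 0]    # symmetric mode, same sign
ratio_high = a[1, 1] / a[0, 1]   # antisymmetric mode, opposite sign
```

The two frequencies should come out as $(\sqrt{5} \mp 1)/2$ times $\sqrt{\kappa/M}$, and the amplitude ratios as $(\sqrt{5}+1)/2$ and $-(\sqrt{5}-1)/2$, consistent with the relations between $a_{12}, a_{22}$ and $a_{11}, a_{21}$ in step 4.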


12.4  Example:  Two  parallel- coupled  plane  pendula 


Consider the coupled double pendulum system shown in the adjacent figure, which comprises two parallel plane pendula weakly coupled by a spring. The angles $\theta_1$ and $\theta_2$ are chosen to be the generalized coordinates and the potential energy is chosen to be zero at equilibrium. Then the kinetic energy is

$$T = \frac{1}{2}m\left(b\dot{\theta}_1\right)^2 + \frac{1}{2}m\left(b\dot{\theta}_2\right)^2$$

Figure: Two parallel-coupled plane pendula.

As discussed in chapter 4, it is necessary to make the small-angle approximation in order to make the equations of motion for the simple pendulum linear and solvable analytically. That is,

$$U = mgb(1 - \cos\theta_1) + mgb(1 - \cos\theta_2) + \frac{1}{2}\kappa\left(b\sin\theta_1 - b\sin\theta_2\right)^2 \approx \frac{mgb}{2}\left(\theta_1^2 + \theta_2^2\right) + \frac{\kappa b^2}{2}\left(\theta_1 - \theta_2\right)^2$$

assuming the small angle approximation $\sin\theta \approx \theta$ and $(1 - \cos\theta) \approx \frac{\theta^2}{2}$.

The second stage is to evaluate the kinetic energy T and potential energy V tensors

$$\mathbf{T} = \begin{pmatrix} mb^2 & 0 \\ 0 & mb^2 \end{pmatrix} \qquad \mathbf{V} = \begin{pmatrix} mgb + \kappa b^2 & -\kappa b^2 \\ -\kappa b^2 & mgb + \kappa b^2 \end{pmatrix}$$
Note that for this case the kinetic energy tensor is diagonal whereas the potential energy tensor is nondiagonal.

The third stage is to evaluate the secular determinant

$$\begin{vmatrix} mgb + \kappa b^2 - \omega^2 mb^2 & -\kappa b^2 \\ -\kappa b^2 & mgb + \kappa b^2 - \omega^2 mb^2 \end{vmatrix} = 0$$

which gives the characteristic equation

$$\left(mgb + \kappa b^2 - \omega^2 mb^2\right)^2 = \left(\kappa b^2\right)^2$$

or

$$mg + \kappa b - \omega^2 mb = \pm\kappa b$$

The two solutions are

$$\omega_1^2 = \frac{g}{b} \qquad \omega_2^2 = \frac{g}{b} + \frac{2\kappa}{m}$$

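The fourth step is to insert these eigenfrequencies into equation 12.51

$$\sum_j \left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0$$

Consider k = 1

$$\left(mgb + \kappa b^2 - \omega_r^2 mb^2\right) a_{1r} - \kappa b^2 a_{2r} = 0$$

Then for the first eigenfrequency, $\omega_1$, the subscripts are k = 1, r = 1

$$\left(mgb + \kappa b^2 - \frac{g}{b}mb^2\right) a_{11} - \kappa b^2 a_{21} = 0$$

which simplifies to

$$a_{11} = a_{21}$$

Similarly, for k = 1, r = 2

$$\left(mgb + \kappa b^2 - \left(\frac{g}{b} + \frac{2\kappa}{m}\right)mb^2\right) a_{12} - \kappa b^2 a_{22} = 0$$

which simplifies to

$$a_{12} = -a_{22}$$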


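The final stage is to write the general coordinates in terms of the normal coordinates

$$\theta_1 = a_{11}\eta_1 + a_{12}\eta_2 = a_{11}\eta_1 - a_{22}\eta_2$$

and

$$\theta_2 = a_{21}\eta_1 + a_{22}\eta_2 = a_{11}\eta_1 + a_{22}\eta_2$$

Adding or subtracting these equations gives that the normal modes are

$$\eta_1 = \frac{1}{2a_{11}}\left(\theta_1 + \theta_2\right) \qquad \eta_2 = \frac{1}{2a_{22}}\left(\theta_2 - \theta_1\right)$$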


As for the case of the double oscillator discussed in example 12.2, the symmetric normal mode corresponds to an oscillation of the center-of-mass, with zero relative motion of the two pendula, which has the lower frequency $\omega_1 = \sqrt{g/b}$. This frequency is the same as for one independent pendulum, as expected since they vibrate in unison and thus the only restoring force is gravity. The antisymmetric mode corresponds to relative motion of the two pendula with stationary center-of-mass and has the frequency $\omega_2 = \sqrt{\frac{g}{b} + \frac{2\kappa}{m}}$ since the restoring force includes both the coupling spring and gravity.

This example introduces the role of degeneracy, which occurs in this system if the coupling of the pendula is zero, that is, $\kappa = 0$, leading to both frequencies being equal, i.e. $\omega_1 = \omega_2 = \sqrt{g/b}$. When $\kappa = 0$, then both {T} and {V} are diagonal and thus in the $(\theta_1, \theta_2)$ space the two pendula are independent normal modes. However, the symmetric and antisymmetric normal modes, as derived above, are equally good normal modes. In fact, since the modes are degenerate, any linear combination of the motions of the independent pendula is an equally good normal mode, and thus one can use any set of orthogonal normal modes to describe the motion.
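The degeneracy argument can be illustrated numerically. The sketch below assumes NumPy; the function names `pendula_tensors` and `eigenfrequencies_squared`, and all parameter values, are hypothetical choices for illustration. It computes the eigenfrequencies with and without the coupling spring, and then checks that an arbitrary unit vector in $(\theta_1, \theta_2)$ space satisfies the eigenvalue equation when $\kappa = 0$.

```python
import numpy as np

def pendula_tensors(m, b, g, kappa):
    """Kinetic and potential energy tensors for two parallel-coupled pendula."""
    T = m * b**2 * np.eye(2)
    V = np.array([[m * g * b + kappa * b**2, -kappa * b**2],
                  [-kappa * b**2, m * g * b + kappa * b**2]])
    return T, V

def eigenfrequencies_squared(T, V):
    """Eigenvalues omega^2 of V a = omega^2 T a, sorted ascending."""
    return np.sort(np.linalg.eigvals(np.linalg.solve(T, V)).real)

m, b, g, kappa = 0.2, 0.5, 9.81, 1.3   # illustrative values

T, V = pendula_tensors(m, b, g, kappa)
coupled = eigenfrequencies_squared(T, V)

T0, V0 = pendula_tensors(m, b, g, 0.0)
uncoupled = eigenfrequencies_squared(T0, V0)

# With kappa = 0, any direction in (theta_1, theta_2) space is an eigenvector
phi = 0.7                                   # arbitrary mixing angle
vec = np.array([np.cos(phi), np.sin(phi)])
residual = V0 @ vec - (g / b) * (T0 @ vec)  # should vanish
```

The vanishing residual for an arbitrary mixing angle is the numerical counterpart of the statement that any orthogonal pair of combinations can serve as normal modes in the degenerate case.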
12.5 Example: The series-coupled double plane pendula

The double-pendula system comprises one plane pendulum attached to the end of another plane pendulum, both oscillating in the same plane. The kinetic and potential energies for this system are given in example 6.23 to be

$$T = \frac{1}{2}(m_1 + m_2)L_1^2\dot{\phi}_1^2 + m_2 L_1 L_2 \dot{\phi}_1\dot{\phi}_2\cos(\phi_1 - \phi_2) + \frac{1}{2}m_2 L_2^2\dot{\phi}_2^2$$

$$U = (m_1 + m_2)gL_1(1 - \cos\phi_1) + m_2 g L_2 (1 - \cos\phi_2)$$

Figure: Two series-coupled plane pendula.

a) Small-amplitude linear regime

Use of the small-angle approximation makes this system linear and solvable analytically. That is, T and U become

$$U = \frac{1}{2}(m_1 + m_2)gL_1\phi_1^2 + \frac{1}{2}m_2 g L_2\phi_2^2$$

$$T = \frac{1}{2}(m_1 + m_2)L_1^2\dot{\phi}_1^2 + m_2 L_1 L_2\dot{\phi}_1\dot{\phi}_2 + \frac{1}{2}m_2 L_2^2\dot{\phi}_2^2$$

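Thus the kinetic energy and potential energy tensors are

$$\mathbf{T} = \begin{pmatrix} (m_1 + m_2)L_1^2 & m_2 L_1 L_2 \\ m_2 L_1 L_2 & m_2 L_2^2 \end{pmatrix} \qquad \mathbf{V} = \begin{pmatrix} (m_1 + m_2)gL_1 & 0 \\ 0 & m_2 g L_2 \end{pmatrix}$$

Note that T is nondiagonal, whereas V is diagonal, which is opposite to the case of the two parallel-coupled plane pendula.

The solution of this case is simpler if it is assumed that $L_1 = L_2 = L$ and $m_1 = m_2 = m$. Then

$$\mathbf{T} = mL^2\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \qquad \mathbf{V} = mL^2\begin{pmatrix} 2\omega_0^2 & 0 \\ 0 & \omega_0^2 \end{pmatrix}$$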

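where $\omega_0 = \sqrt{g/L}$, which is the frequency of a single pendulum.

The next stage is to evaluate the secular determinant

$$mL^2\begin{vmatrix} 2(\omega_0^2 - \omega^2) & -\omega^2 \\ -\omega^2 & \omega_0^2 - \omega^2 \end{vmatrix} = 0$$

The eigenvalues are

$$\omega_1^2 = (2 - \sqrt{2})\,\omega_0^2 \qquad \omega_2^2 = (2 + \sqrt{2})\,\omega_0^2$$

As shown in the adjacent figure, the normal modes for this system are

$$\eta_1 = \frac{1}{2a_{11}}\left(\phi_1 + \frac{\phi_2}{\sqrt{2}}\right) \qquad \eta_2 = \frac{1}{2a_{22}}\left(\phi_1 - \frac{\phi_2}{\sqrt{2}}\right)$$

Figure: Normal modes for two series-coupled plane pendula.

The second mass has a $\sqrt{2}$ larger amplitude that is in phase for solution 1 and out of phase for solution 2.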
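Since T is nondiagonal in this example, it is a useful test of a general solver. The following sketch, assuming NumPy and using illustrative values of `g`, `L` and `m`, solves $\mathbf{V}a = \omega^2\mathbf{T}a$ for the equal-length, equal-mass case and extracts the amplitude ratio $\phi_2/\phi_1$ in each mode.

```python
import numpy as np

g, L, m = 9.81, 1.0, 1.0          # illustrative values
omega0_sq = g / L                 # single-pendulum frequency squared

# Small-angle tensors for L1 = L2 = L and m1 = m2 = m
T = m * L**2 * np.array([[2.0, 1.0],
                         [1.0, 1.0]])
V = m * L**2 * np.array([[2.0 * omega0_sq, 0.0],
                         [0.0, omega0_sq]])

# Generalized eigenproblem V a = omega^2 T a via T^{-1} V
w2, a = np.linalg.eig(np.linalg.solve(T, V))
order = np.argsort(w2.real)
w2, a = w2.real[order], a.real[:, order]

# Ratio phi_2 / phi_1 for the lower and higher mode
amplitude_ratio = a[1, :] / a[0, :]
```

The eigenvalues should reproduce $(2 \mp \sqrt{2})\,\omega_0^2$, and the amplitude ratios should be $+\sqrt{2}$ for the lower mode and $-\sqrt{2}$ for the higher mode, matching the in-phase and out-of-phase statement above.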

b) Large amplitude chaotic regime

Stachowiak and Okada [Sta05] used computer simulations to numerically analyze the behavior of this system with increase in the oscillation amplitudes. Poincaré sections, bifurcation diagrams, and Lyapunov exponents all confirm that this system evolves from regular normal-mode oscillatory behavior in the linear regime at low energy, to chaotic behavior at high excitation energies where non-linearity dominates. This behavior is analogous to that of the driven, linearly-damped, harmonic pendulum described in chapter 4.5.
12.8 Three-body coupled linear oscillator systems

Chapter  12.7  discussed  parallel  and  series  arrangements  of  two  coupled  oscillators.  Extending  from  two  to 
three  coupled  linear  oscillators  introduces  interesting  new  characteristics  of  coupled  oscillator  systems.  For 
more  than  two  coupled  oscillators,  coupled  oscillator  systems  separate  into  two  classifications  depending  on 
whether  each  oscillator  is  coupled  to  the  remaining  n — 1 oscillators,  or  when  the  coupling  is  only  to  the 
nearest  neighbors  as  illustrated  below. 


12.6  Example:  Three  plane  pendula;  mean-field  linear  coupling 


Consider three identical pendula with mass m and length b, suspended from a common support that yields slightly to pendulum motion, leading to a coupling between all three pendula as illustrated in the adjacent figure. Assume that the motions of the three pendula are all in the same plane. This case is analogous to the piano, where three strings in the treble section are coupled by the slightly-yielding common bridge plus sounding board, leading to coupling between each of the three coupled oscillators. This case illustrates the important concept of degeneracy.

Figure: Three plane pendula with complete linear coupling.

The generalized coordinates are the angles $\theta_1$, $\theta_2$, and $\theta_3$. Assume that the support yields such that the actual deflection angle for pendulum 1 is

$$\theta_1' = \theta_1 - \frac{\varepsilon}{2}\left(\theta_2 + \theta_3\right)$$

where the coupling coefficient $\varepsilon$ is small and involves all the pendula, not just the nearest neighbors. The same relation exists for the other angle coordinates. The gravitational potential energy of each pendulum is given by

$$U_1 = mgb(1 - \cos\theta_1') \approx \frac{1}{2}mgb\,\theta_1'^2$$

assuming the small angle approximation. Ignoring terms of order $\varepsilon^2$ gives that the potential energy is

$$U = \frac{mgb}{2}\left(\theta_1^2 + \theta_2^2 + \theta_3^2 - 2\varepsilon\theta_1\theta_2 - 2\varepsilon\theta_1\theta_3 - 2\varepsilon\theta_2\theta_3\right)$$

The kinetic energy evaluated at the equilibrium location is

$$T = \frac{1}{2}m\left(b\dot{\theta}_1\right)^2 + \frac{1}{2}m\left(b\dot{\theta}_2\right)^2 + \frac{1}{2}m\left(b\dot{\theta}_3\right)^2$$


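The next stage is to evaluate the {T} and {V} tensors

$$\mathbf{T} = mb^2\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad \mathbf{V} = mgb\begin{pmatrix} 1 & -\varepsilon & -\varepsilon \\ -\varepsilon & 1 & -\varepsilon \\ -\varepsilon & -\varepsilon & 1 \end{pmatrix}$$

The third stage is to evaluate the secular determinant, which can be written as

$$mgb\begin{vmatrix} 1 - \frac{b}{g}\omega^2 & -\varepsilon & -\varepsilon \\ -\varepsilon & 1 - \frac{b}{g}\omega^2 & -\varepsilon \\ -\varepsilon & -\varepsilon & 1 - \frac{b}{g}\omega^2 \end{vmatrix} = 0$$

Expanding and factoring gives

$$\left(\frac{b}{g}\omega^2 - 1 - \varepsilon\right)\left(\frac{b}{g}\omega^2 - 1 - \varepsilon\right)\left(\frac{b}{g}\omega^2 - 1 + 2\varepsilon\right) = 0$$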
The roots are

$$\omega_1 = \sqrt{\frac{g}{b}}\sqrt{1 + \varepsilon} \qquad \omega_2 = \sqrt{\frac{g}{b}}\sqrt{1 + \varepsilon} \qquad \omega_3 = \sqrt{\frac{g}{b}}\sqrt{1 - 2\varepsilon}$$

This case results in two degenerate eigenfrequencies, $\omega_1 = \omega_2$, while $\omega_3$ is the lowest eigenfrequency.

The eigenvectors can be determined by substitution of the eigenfrequencies into

$$\sum_j^n \left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0$$

Consider the lowest eigenfrequency $\omega_3$, i.e. r = 3, for k = 1. Substituting $\omega_3 = \sqrt{\frac{g}{b}}\sqrt{1 - 2\varepsilon}$ gives

$$2\varepsilon a_{13} - \varepsilon a_{23} - \varepsilon a_{33} = 0$$

while for r = 3, k = 2

$$-\varepsilon a_{13} + 2\varepsilon a_{23} - \varepsilon a_{33} = 0$$

Solving these gives

$$a_{13} = a_{23} = a_{33}$$

Assuming that the eigenfunction is normalized

$$a_{13}^2 + a_{23}^2 + a_{33}^2 = 1$$

then for the third eigenvector $a_3$

$$a_{13} = a_{23} = a_{33} = \frac{1}{\sqrt{3}}$$

This solution corresponds to all three pendula oscillating in phase with the same amplitude, that is, a coherent oscillation.

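Derivation of the eigenfunctions for the other two eigenfrequencies is complicated because, due to the degeneracy $\omega_1 = \omega_2$, there are only five independent equations to specify the six unknowns for the eigenvectors $a_1$ and $a_2$. That is, the eigenvectors can be chosen freely as long as the orthogonality and normalization are satisfied. For example, setting $a_{31} = 0$, to remove the indeterminacy, results in the a matrix

$$\{a\} = \begin{pmatrix} \frac{1}{2}\sqrt{2} & \frac{1}{6}\sqrt{6} & \frac{1}{3}\sqrt{3} \\ -\frac{1}{2}\sqrt{2} & \frac{1}{6}\sqrt{6} & \frac{1}{3}\sqrt{3} \\ 0 & -\frac{1}{3}\sqrt{6} & \frac{1}{3}\sqrt{3} \end{pmatrix}$$

and thus the solution is given by

$$\begin{pmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \end{pmatrix} = \begin{pmatrix} \frac{1}{2}\sqrt{2} & \frac{1}{6}\sqrt{6} & \frac{1}{3}\sqrt{3} \\ -\frac{1}{2}\sqrt{2} & \frac{1}{6}\sqrt{6} & \frac{1}{3}\sqrt{3} \\ 0 & -\frac{1}{3}\sqrt{6} & \frac{1}{3}\sqrt{3} \end{pmatrix}\begin{pmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \end{pmatrix}$$

The normal modes are obtained by taking the inverse matrix $\{a\}^{-1}$ and using $\{\eta\} = \{a\}^{-1}\{\theta\}$. Note that since {a} is real and orthogonal, then $\{a\}^{-1}$ equals the transpose of {a}. That is,

$$\begin{pmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \end{pmatrix} = \begin{pmatrix} \frac{1}{2}\sqrt{2} & -\frac{1}{2}\sqrt{2} & 0 \\ \frac{1}{6}\sqrt{6} & \frac{1}{6}\sqrt{6} & -\frac{1}{3}\sqrt{6} \\ \frac{1}{3}\sqrt{3} & \frac{1}{3}\sqrt{3} & \frac{1}{3}\sqrt{3} \end{pmatrix}\begin{pmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \end{pmatrix}$$

The normal mode $\eta_3$ has eigenfrequency

$$\omega_3 = \sqrt{\frac{g}{b}}\sqrt{1 - 2\varepsilon}$$

and eigenvector

$$\eta_3 = \frac{1}{\sqrt{3}}\left(\theta_1, \theta_2, \theta_3\right)$$

This corresponds to the in-phase oscillation of all three pendula.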

The other two degenerate solutions are

$$\eta_1 = \frac{1}{\sqrt{2}}\left(\theta_1, -\theta_2, 0\right) \qquad \eta_2 = \frac{1}{\sqrt{6}}\left(\theta_1, \theta_2, -2\theta_3\right)$$

with eigenvalues

$$\omega_1 = \omega_2 = \sqrt{\frac{g}{b}}\sqrt{1 + \varepsilon}$$

These two degenerate normal modes correspond to two pendula oscillating out of phase with the same amplitude, or two oscillating in phase with the same amplitude and the third out of phase with twice the amplitude. An important result of this toy model is that the most symmetric mode $\eta_3$ is pushed far from all the other modes. Note that for this example, the coherent mode $a_3$ corresponds to the center-of-mass oscillation with no relative motion between the three pendula. This is in contrast to the eigenvectors $a_1$ and $a_2$, which both correspond to relative motion of the pendula such that there is zero center-of-mass motion. This mean-field coupling behavior is exhibited by collective motion in nuclei as discussed in example 12.13.
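As a numerical cross-check of the mean-field result, the sketch below (assuming NumPy, with illustrative values for `g`, `b` and `eps`) computes the eigenvalues of $\mathbf{T}^{-1}\mathbf{V}$ and verifies that the orthogonal matrix {a} obtained with the choice $a_{31} = 0$ diagonalizes it.

```python
import numpy as np

g, b, eps = 9.81, 1.0, 0.05   # illustrative values; eps is the small coupling

# T = m b^2 * identity, so T^{-1} V = (g/b) * coupling matrix (m cancels)
V_over_T = (g / b) * np.array([[1.0, -eps, -eps],
                               [-eps, 1.0, -eps],
                               [-eps, -eps, 1.0]])

omega2 = np.linalg.eigvalsh(V_over_T)   # ascending eigenvalues omega^2

# Eigenvector matrix with columns a_1, a_2, a_3 (choice a_31 = 0)
a = np.column_stack([
    np.array([1.0, -1.0, 0.0]) / np.sqrt(2),
    np.array([1.0, 1.0, -2.0]) / np.sqrt(6),
    np.array([1.0, 1.0, 1.0]) / np.sqrt(3),
])

# a is orthogonal, so a^T (T^{-1} V) a should be diagonal
diagonalized = a.T @ V_over_T @ a
```

The diagonal of `diagonalized` should contain $(g/b)(1+\varepsilon)$ twice and $(g/b)(1-2\varepsilon)$ once, showing the coherent mode split off from the degenerate pair.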


12.7  Example:  Three  plane  pendula;  nearest-neighbor  coupling 

There is a large and important class of coupled oscillators where the coupling is only between nearest neighbors; a crystalline lattice is a classic example. A toy model for such a system is the case of three identical pendula coupled by two identical springs, where only the nearest neighbors are coupled as shown in the adjacent figure. Assume the identical pendula are of length b and mass m. As in the last example, the kinetic energy evaluated at the equilibrium location is

$$T = \frac{1}{2}mb^2\dot{\theta}_1^2 + \frac{1}{2}mb^2\dot{\theta}_2^2 + \frac{1}{2}mb^2\dot{\theta}_3^2$$

Figure: Three plane pendula with nearest-neighbour coupling.

The gravitational potential energy of each pendulum equals $mgb(1 - \cos\theta) \approx \frac{1}{2}mgb\,\theta^2$, thus

$$U_{grav} = \frac{1}{2}mgb\left(\theta_1^2 + \theta_2^2 + \theta_3^2\right)$$

while the potential energy in the springs is given by

$$U_{spring} = \frac{1}{2}\kappa b^2\left[(\theta_2 - \theta_1)^2 + (\theta_3 - \theta_2)^2\right] = \frac{1}{2}\kappa b^2\left[\theta_1^2 + 2\theta_2^2 + \theta_3^2 - 2\theta_1\theta_2 - 2\theta_2\theta_3\right]$$

Thus the total potential energy is given by

$$U = \frac{1}{2}mgb\left(\theta_1^2 + \theta_2^2 + \theta_3^2\right) + \frac{1}{2}\kappa b^2\left[\theta_1^2 + 2\theta_2^2 + \theta_3^2 - 2\theta_1\theta_2 - 2\theta_2\theta_3\right]$$

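The Lagrangian then becomes

$$L = \frac{1}{2}mb^2\left(\dot{\theta}_1^2 + \dot{\theta}_2^2 + \dot{\theta}_3^2\right) - \frac{1}{2}\left(mgb + \kappa b^2\right)\theta_1^2 - \frac{1}{2}\left(mgb + 2\kappa b^2\right)\theta_2^2 - \frac{1}{2}\left(mgb + \kappa b^2\right)\theta_3^2 + \kappa b^2\left(\theta_1\theta_2 + \theta_2\theta_3\right)$$

Using this in the Euler-Lagrange equations gives the equations of motion

$$mb^2\ddot{\theta}_1 + (mgb + \kappa b^2)\theta_1 - \kappa b^2\theta_2 = 0$$

$$mb^2\ddot{\theta}_2 + (mgb + 2\kappa b^2)\theta_2 - \kappa b^2(\theta_1 + \theta_3) = 0$$

$$mb^2\ddot{\theta}_3 + (mgb + \kappa b^2)\theta_3 - \kappa b^2\theta_2 = 0$$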

The general analytic approach requires the T and V energy tensors given by

$$\mathbf{T} = mb^2\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad \mathbf{V} = \begin{pmatrix} mgb + \kappa b^2 & -\kappa b^2 & 0 \\ -\kappa b^2 & mgb + 2\kappa b^2 & -\kappa b^2 \\ 0 & -\kappa b^2 & mgb + \kappa b^2 \end{pmatrix}$$

Note that in contrast to the prior case of three fully-coupled pendula, for the nearest neighbor case the potential energy tensor {V} is non-zero only on the diagonal and on the ±1 components parallel to the diagonal.

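The third stage is to evaluate the secular determinant of the $(\mathbf{V} - \omega^2\mathbf{T})$ matrix, that is

$$\begin{vmatrix} mgb + \kappa b^2 - \omega^2 mb^2 & -\kappa b^2 & 0 \\ -\kappa b^2 & mgb + 2\kappa b^2 - \omega^2 mb^2 & -\kappa b^2 \\ 0 & -\kappa b^2 & mgb + \kappa b^2 - \omega^2 mb^2 \end{vmatrix} = 0$$

This results in the characteristic equation

$$\left(mgb - \omega^2 mb^2\right)\left(mgb + \kappa b^2 - \omega^2 mb^2\right)\left(mgb + 3\kappa b^2 - \omega^2 mb^2\right) = 0$$

which results in the three non-degenerate eigenfrequencies for the normal modes.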

The normal modes are similar to the prior case of complete linear coupling, as shown in the adjacent figure.

$\omega_1 = \sqrt{\frac{g}{b}}$: This lowest mode $\eta_1$ involves the three pendula oscillating in phase such that the springs are not stretched or compressed; thus the period of this coherent oscillation is the same as that of an independent pendulum of mass m and length b. That is

$$\eta_1 = \frac{1}{\sqrt{3}}\left(\theta_1, \theta_2, \theta_3\right)$$

$\omega_2 = \sqrt{\frac{g}{b} + \frac{\kappa}{m}}$: This second mode $\eta_2$ has the central mass stationary with the outer pendula oscillating with the same amplitude and out of phase. That is

$$\eta_2 = \frac{1}{\sqrt{2}}\left(\theta_1, 0, -\theta_3\right)$$

$\omega_3 = \sqrt{\frac{g}{b} + \frac{3\kappa}{m}}$: This third mode $\eta_3$ involves the outer pendula in phase with the same amplitude while the central pendulum oscillates with angle $\theta_2 = -2\theta_1$. That is

$$\eta_3 = \frac{1}{\sqrt{6}}\left(\theta_1, -2\theta_2, \theta_3\right)$$


Similar to the prior case of three completely-coupled pendula, the coherent normal mode $\eta_1$ corresponds to an oscillation of the center-of-mass with no relative motion, while $\eta_2$ and $\eta_3$ correspond to relative motion of the pendula with stationary center of mass. In contrast to the prior example of complete coupling, for nearest neighbor coupling the two higher lying solutions are not degenerate. That is, the nearest neighbor coupling solutions differ from when all masses are linearly coupled.

It is interesting to note that this example combines two coupling mechanisms that can be used to predict the solutions for two extreme cases by switching off one of these coupling mechanisms. Switching off the coupling springs, by setting $\kappa = 0$, makes all three normal frequencies degenerate with $\omega_1 = \omega_2 = \omega_3 = \sqrt{g/b}$. This corresponds to three independent identical pendula each with frequency $\omega = \sqrt{g/b}$. The three linear combinations $\eta_1, \eta_2, \eta_3$ also have this same frequency; in particular $\eta_1$ corresponds to an in-phase oscillation of the three pendula. The three uncoupled pendula are independent and any combination of the three modes is allowed since the three frequencies are degenerate.

The other extreme is to let $\frac{g}{b} = 0$, that is, switch off the gravitational field or let $b \to \infty$; then the only coupling is due to the two springs. This results in $\omega_1 = 0$ because there is no restoring force acting on the coherent motion of the three in-phase coupled oscillators; as a result, oscillatory motion cannot be sustained since it corresponds to the center of mass oscillation with no external forces acting, which is spurious. That is, this spurious solution corresponds to constant linear translation.

Figure: Normal modes of three plane pendula with nearest-neighbour coupling. The three panels show the in-phase mode, the mode with $\theta_2 = 0$ and $\theta_3 = -\theta_1$, and the mode with $\theta_2 = -2\theta_1$ and $\theta_3 = \theta_1$.
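The two limiting cases discussed above are easy to explore numerically. The following sketch assumes NumPy; the function name `chain_eigenfrequencies_squared` and the parameter values are hypothetical. It evaluates $\omega^2$ for the nearest-neighbour system in general, with the springs switched off, and with gravity switched off.

```python
import numpy as np

def chain_eigenfrequencies_squared(m, b, g, kappa):
    """omega^2 (ascending) for three pendula with nearest-neighbour springs."""
    V = np.array([
        [m * g * b + kappa * b**2, -kappa * b**2, 0.0],
        [-kappa * b**2, m * g * b + 2 * kappa * b**2, -kappa * b**2],
        [0.0, -kappa * b**2, m * g * b + kappa * b**2],
    ])
    # T = m b^2 * identity, so the generalized problem reduces to V / (m b^2)
    return np.linalg.eigvalsh(V / (m * b**2))

m, b, g, kappa = 0.3, 0.8, 9.81, 2.0   # illustrative values

general = chain_eigenfrequencies_squared(m, b, g, kappa)
no_springs = chain_eigenfrequencies_squared(m, b, g, 0.0)   # kappa = 0
no_gravity = chain_eigenfrequencies_squared(m, b, 0.0, kappa)  # g = 0
```

In the general case the three values should be $g/b$, $g/b + \kappa/m$ and $g/b + 3\kappa/m$. With `kappa = 0` all three collapse to $g/b$, and with `g = 0` the lowest value goes to zero, which is the spurious translation mode.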
12.8 Example: System of three bodies coupled by six springs

Consider the completely-coupled mechanical system shown in the adjacent figure.

1) The first stage is to determine the potential and kinetic energies using an appropriate set of generalized coordinates, which here are $x_1$, $x_2$ and $x_3$. The potential energy is the sum of the potential energies for each of the six springs

$$U = \frac{3}{2}\kappa x_1^2 + \frac{3}{2}\kappa x_2^2 + \frac{3}{2}\kappa x_3^2 - \kappa x_1 x_2 - \kappa x_1 x_3 - \kappa x_2 x_3$$

while the kinetic energy is given by

$$T = \frac{1}{2}m\dot{x}_1^2 + \frac{1}{2}m\dot{x}_2^2 + \frac{1}{2}m\dot{x}_3^2$$

2) The second stage is to evaluate the potential energy V and kinetic energy T tensors.

$$\mathbf{V} = \begin{pmatrix} 3\kappa & -\kappa & -\kappa \\ -\kappa & 3\kappa & -\kappa \\ -\kappa & -\kappa & 3\kappa \end{pmatrix} \qquad \mathbf{T} = \begin{pmatrix} m & 0 & 0 \\ 0 & m & 0 \\ 0 & 0 & m \end{pmatrix}$$

Note that for this case the kinetic energy tensor is diagonal whereas the potential energy tensor is nondiagonal and corresponds to complete coupling of the three coordinates.

3) The third stage is to use the potential V and kinetic T energy tensors to evaluate the secular determinant giving

$$\begin{vmatrix} 3\kappa - m\omega^2 & -\kappa & -\kappa \\ -\kappa & 3\kappa - m\omega^2 & -\kappa \\ -\kappa & -\kappa & 3\kappa - m\omega^2 \end{vmatrix} = 0$$

The expansion of this secular determinant yields

$$\left(\kappa - m\omega^2\right)\left(4\kappa - m\omega^2\right)\left(4\kappa - m\omega^2\right) = 0$$

The solution for this completely-coupled system has two degenerate eigenvalues.

Figure: System of three bodies coupled by six springs.

4) The fourth step is to insert these eigenfrequencies into the secular equation

$$\sum_j \left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0$$

to determine the coefficients $a_{jr}$.

5) The final stage is to write the general coordinates in terms of the normal coordinates.

The result is that the angular frequency $\omega_3 = \sqrt{\kappa/m}$ corresponds to a normal mode for which the three masses oscillate in phase, corresponding to a center-of-mass oscillation with no relative motion of the masses.

$$\eta_3 = \frac{1}{\sqrt{3}}\left(x_1 + x_2 + x_3\right)$$

For this coherent motion only one spring per mass is stretched, resulting in the same frequency as one mass on a spring. The other two solutions correspond to the three masses oscillating out of phase, which implies all three springs are stretched and thus the angular frequency is higher. Since the two eigenvalues $\omega_1 = \omega_2 = 2\sqrt{\kappa/m}$ are degenerate, there are only five independent equations to specify the six unknowns for the degenerate eigenvalues. Thus it is possible to select a combination of the eigenvectors $\eta_1$ and $\eta_2$ such that the combination is orthogonal to $\eta_3$. Choose $a_{31} = 0$ to remove the indeterminacy. Then adding or subtracting gives that the normal modes are

$$\eta_1 = \frac{1}{\sqrt{2}}\left(x_1 - x_2\right) \qquad \eta_2 = \frac{1}{\sqrt{6}}\left(x_1 + x_2 - 2x_3\right)$$

These two degenerate normal modes correspond to relative motion of the masses with stationary center-of-mass.
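The definition $V_{jk} = (\partial^2 U/\partial q_j \partial q_k)_0$ used in stage 2 can be applied numerically by differentiating the potential energy directly. The sketch below assumes NumPy and assumes that the six springs consist of three springs anchoring each mass to a fixed point plus three springs joining each pair of masses, which reproduces the potential energy quoted in stage 1. The `hessian` helper and all numerical values are illustrative.

```python
import numpy as np

kappa, m = 2.0, 0.5   # illustrative values

def U(x):
    """Potential energy of six identical springs (assumed layout, see lead-in)."""
    x1, x2, x3 = x
    anchors = 0.5 * kappa * (x1**2 + x2**2 + x3**2)
    pairs = 0.5 * kappa * ((x1 - x2)**2 + (x1 - x3)**2 + (x2 - x3)**2)
    return anchors + pairs

def hessian(f, x0, h=1e-3):
    """Central-difference estimate of the second-derivative matrix of f at x0."""
    n = len(x0)
    H = np.zeros((n, n))
    for j in range(n):
        for k in range(n):
            ej = np.eye(n)[j] * h
            ek = np.eye(n)[k] * h
            H[j, k] = (f(x0 + ej + ek) - f(x0 + ej - ek)
                       - f(x0 - ej + ek) + f(x0 - ej - ek)) / (4 * h**2)
    return H

V = hessian(U, np.zeros(3))            # potential energy tensor at equilibrium
omega2 = np.linalg.eigvalsh(V / m)     # T = m * identity
```

The estimated tensor should match the matrix with $3\kappa$ on the diagonal and $-\kappa$ elsewhere, and the eigenvalues should be $\kappa/m$ once and $4\kappa/m$ twice. For a potential that is not quadratic, the finite-difference step `h` would need more care.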
12.9 Molecular coupled oscillator systems

There  are  many  examples  of  coupled  oscillations  in  atomic  and  molecular  physics  most  of  which  involve 
nearest-neighbor  coupling.  The  following  two  examples  are  for  molecular  coupled  oscillators.  The  triatomic 
molecule  is  a typical  linearly-coupled  molecular  oscillator.  The  benzene  molecule  is  an  elementary  example 
of  a ring  structure  coupled  oscillator. 

12.9 Example: Linear triatomic molecule CO$_2$

Molecules provide excellent examples of vibrational modes involving nearest neighbor coupling. Depending on the atomic structure, triatomic molecules can be either linear, like CO$_2$, or bent like water, H$_2$O, which has a bend angle of about $\theta = 104.5°$. A molecule with n atoms has 3n degrees of freedom. There are three degrees of freedom for translation and three degrees of freedom for rotation, leaving 3n - 6 degrees of freedom for vibrations. A triatomic molecule has three vibrational modes, two longitudinal and one transverse. Consider the normal modes for vibration of the linear molecule CO$_2$.


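Longitudinal modes

The coordinate system used is illustrated in the adjacent figure.

The Lagrangian for this system is

$$L = \left(\frac{m}{2}\dot{x}_1^2 + \frac{M}{2}\dot{x}_2^2 + \frac{m}{2}\dot{x}_3^2\right) - \frac{\kappa}{2}\left[(x_2 - x_1)^2 + (x_3 - x_2)^2\right]$$

Evaluating the kinetic energy tensor gives

$$\mathbf{T} = \begin{pmatrix} m & 0 & 0 \\ 0 & M & 0 \\ 0 & 0 & m \end{pmatrix}$$

while the potential energy tensor gives

$$\mathbf{V} = \begin{pmatrix} \kappa & -\kappa & 0 \\ -\kappa & 2\kappa & -\kappa \\ 0 & -\kappa & \kappa \end{pmatrix}$$

The secular equation becomes

$$\begin{vmatrix} (-m\omega^2 + \kappa) & -\kappa & 0 \\ -\kappa & (-M\omega^2 + 2\kappa) & -\kappa \\ 0 & -\kappa & (-m\omega^2 + \kappa) \end{vmatrix} = 0$$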

Note that the same answer is obtained using Newtonian mechanics. That is, the force equation gives

$$m\ddot{x}_1 - \kappa(x_2 - x_1) = 0$$

$$M\ddot{x}_2 + \kappa(x_2 - x_1) - \kappa(x_3 - x_2) = 0$$

$$m\ddot{x}_3 + \kappa(x_3 - x_2) = 0$$

Let the solution be of the form

$$x_j = a_j e^{i\omega t} \qquad j = 1, 2, 3$$

Substituting this solution gives

$$(-m\omega^2 + \kappa)\,a_1 - \kappa a_2 = 0$$

$$-\kappa a_1 + (-M\omega^2 + 2\kappa)\,a_2 - \kappa a_3 = 0$$

$$-\kappa a_2 + (-m\omega^2 + \kappa)\,a_3 = 0$$

This leads to the same secular determinant as given above, with the matrix elements clustered along the diagonal for nearest-neighbor problems.
Expanding the determinant and collecting terms yields

$$\omega^2\left(-m\omega^2 + \kappa\right)\left(-mM\omega^2 + \kappa M + 2\kappa m\right) = 0$$

Equating each of the three factors to zero gives the following.

The solutions are:

1) $\omega_1 = 0$: This solution gives $\eta_1 = a\{1, 1, 1\}$. This mode is not an oscillation at all, but is a pure translation of the system as a whole as shown in the adjacent figure. There is no change in the restoring forces since the system moves such as not to change the length of the springs, that is, they stay in their equilibrium positions.

This motion corresponds to a spurious oscillation of the center of mass that results from referencing the three atom locations with respect to some fixed reference point. This reference point should have been chosen as the center of mass since the motion of the center-of-mass already has been taken into account separately. Spurious center of mass oscillations occur any time that the reference point is not at the center of mass for an isolated system with no external forces acting.

2) $\omega_2 = \sqrt{\frac{\kappa}{m}}$: This solution corresponds to $\eta_2 = a\{1, 0, -1\}$ and is shown in the adjacent figure. The central mass M remains stationary while the two end masses vibrate longitudinally in opposite directions with the same amplitude. This mode has a stationary center of mass. For CO$_2$ the electrical geometry is O$^-$C$^{++}$O$^-$. Mode 2 for CO$_2$ does not radiate electromagnetically because the center of charge is stationary with respect to the center of mass, that is, the electric dipole moment is constant.

3) $\omega_3 = \sqrt{\frac{\kappa}{m} + \frac{2\kappa}{M}}$: This solution corresponds to $\eta_3 = a\{1, -2\left(\frac{m}{M}\right), 1\}$. As shown in the adjacent figure, this motion corresponds to the two end masses vibrating in unison while the central mass vibrates oppositely with a different amplitude such that the center-of-mass is stationary. This CO$_2$ mode does radiate electromagnetically since it corresponds to an oscillating electric dipole.

It is interesting to note that the ratio $\frac{\omega_3}{\omega_2} = 1.915$ for CO$_2$ and the ratio of the two modes is independent of the potential energy tensor V. That is

$$\frac{\omega_3}{\omega_2} = \sqrt{1 + \frac{2m}{M}}$$

Transverse modes

The solutions are:

4) $\omega_4$: This is the only non-spurious transverse mode, which corresponds to the two outside masses vibrating in unison transverse to the symmetry axis while the central mass vibrates oppositely. This mode radiates electric dipole radiation since the electric dipole is oscillating.

5) $\omega_5 = 0$. This transverse solution $\eta_5$ has all three nuclei vibrating in unison transverse to the symmetry axis and corresponds to a spurious center of mass oscillation.

6) $\omega_6 = 0$. This transverse solution $\eta_6$ corresponds to a stationary central mass with the two outside masses vibrating oppositely. This corresponds to a rotational oscillation of the molecule, which is spurious since there are no torques acting on the molecule for a central force. Rotational motion usually is taken into account separately.

The normal modes for the bent triatomic molecule are similar except that the oscillator coupling strength is reduced by the factor $\cos\theta$ where $\theta$ is the bend angle.

Figure: Normal modes of a linear triatomic molecule.
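The longitudinal modes of the linear triatomic molecule can be reproduced with a few lines of code. The sketch below assumes NumPy and takes the oxygen and carbon masses as 16 and 12 atomic mass units; the spring constant `kappa` is set to an arbitrary value since the frequency ratio does not depend on it. Because T has unequal diagonal entries, mass-weighted coordinates are used to obtain a symmetric eigenvalue problem.

```python
import numpy as np

m_O, M_C, kappa = 16.0, 12.0, 1.0   # masses in atomic mass units; kappa arbitrary

T = np.diag([m_O, M_C, m_O])
V = kappa * np.array([[1.0, -1.0, 0.0],
                      [-1.0, 2.0, -1.0],
                      [0.0, -1.0, 1.0]])

# Mass-weighted coordinates: T^{-1/2} V T^{-1/2} has the same eigenvalues omega^2
inv_sqrt_T = np.diag(1.0 / np.sqrt(np.diag(T)))
omega2, vecs = np.linalg.eigh(inv_sqrt_T @ V @ inv_sqrt_T)

# Convert back to displacement amplitudes a_j (columns correspond to modes)
amplitudes = inv_sqrt_T @ vecs

# Ratio of the two non-zero longitudinal frequencies
ratio = np.sqrt(omega2[2] / omega2[1])
```

The lowest eigenvalue should be zero to rounding error (the spurious translation), the next should equal $\kappa/m$, and the ratio of the two vibrational frequencies should be $\sqrt{1 + 2m/M}$, independent of `kappa`.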
12.10 Example: Benzene ring

The benzene ring comprises six carbon atoms bound in a plane hexagonal ring. A classical analog of the benzene ring comprises 6 identical masses m on a frictionless ring bound by 6 identical springs with linear spring constant $\kappa$, as illustrated in the adjacent figure. Consider only the in-plane motion; then the kinetic energy is given by

$$T = \frac{1}{2}mr^2\sum_{i=1}^{6}\dot{\theta}_i^2$$

The potential energy equals

$$U = \frac{1}{2}\kappa r^2\sum_{i=1}^{6}\left(\theta_{i+1} - \theta_i\right)^2 = \kappa r^2\left[\sum_{i=1}^{6}\theta_i^2 - \theta_1\theta_2 - \theta_2\theta_3 - \theta_3\theta_4 - \theta_4\theta_5 - \theta_5\theta_6 - \theta_6\theta_1\right]$$

where $i = 7 \equiv 1$. Thus the kinetic energy and potential energy tensors are given by


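$$\mathbf{T} = m r^2 \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} \qquad \mathbf{U} = K r^2 \begin{pmatrix} 2 & -1 & 0 & 0 & 0 & -1 \\ -1 & 2 & -1 & 0 & 0 & 0 \\ 0 & -1 & 2 & -1 & 0 & 0 \\ 0 & 0 & -1 & 2 & -1 & 0 \\ 0 & 0 & 0 & -1 & 2 & -1 \\ -1 & 0 & 0 & 0 & -1 & 2 \end{pmatrix}$$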

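This nearest-neighbor system includes non-zero (n, 1) and (1, n) elements due to the ring structure. Define $x = \frac{\omega^2 m}{K} - 2$, then the solution of the set of linear homogeneous equations requires that

$$\begin{vmatrix} x & 1 & 0 & 0 & 0 & 1 \\ 1 & x & 1 & 0 & 0 & 0 \\ 0 & 1 & x & 1 & 0 & 0 \\ 0 & 0 & 1 & x & 1 & 0 \\ 0 & 0 & 0 & 1 & x & 1 \\ 1 & 0 & 0 & 0 & 1 & x \end{vmatrix} = 0$$

that is

$$(x - 2)(x - 1)^2 (x + 1)^2 (x + 2) = 0$$

The roots $x = 2, 1, 1, -1, -1, -2$ correspond to $\omega^2 = (x + 2)\frac{K}{m}$, that is, $\omega_1 = 2\sqrt{K/m}$, $\omega_{2,3} = \sqrt{3K/m}$, $\omega_{4,5} = \sqrt{K/m}$, and $\omega_6 = 0$. The eigenvalues and eigenfunctions are given in the table.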


Note  the  following  properties  of  the  normal  modes  and  their  frequencies. 

n = 1:  Adjacent  masses  vibrate  180°  out  of  phase,  thus  each  spring  has  maximal  compression  or  extension, 
leading  to  the  energy  of  this  normal  mode  being  the  highest. 

n = 2, 3: These two solutions are degenerate and correspond to two pairs of masses vibrating out of phase while the third pair of masses is stationary. Thus the energy of these normal modes is slightly lower than that of the n = 1 normal mode. Any linear combination of these degenerate normal modes is an equally good solution.

n = 4, 5:  From  the  figure  it  can  be  seen  that  both  of  these  solutions  correspond  to  a center  of  mass 
oscillation  and  thus  these  modes  are  spurious. 

n = 6:  This  vibrational  mode  has  zero  energy  corresponding  to  zero  restoring  force  and  all  six  masses 
moving  uniformly  in  the  same  direction.  This  mode  corresponds  to  the  rotation  of  the  benzene  molecule  about 
the  symmetry  axis  of  the  ring  which  usually  is  taken  into  account  assuming  a separate  rotational  component. 

This  classical  analog  of  the  benzene  molecule  is  interesting  because  it  simultaneously  exhibits  degenerate 
normal  modes,  spurious  center  of  mass  oscillation,  and  a rotational  mode. 
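The eigenvalue structure of the ring can be checked numerically. The sketch below is an illustration only: it builds the dimensionless potential matrix $V/(Kr^2)$ for six masses and verifies that cosine waves around the ring are eigenvectors with eigenvalues $m\omega^2/K = 2 - 2\cos(2\pi p/6)$. The variable names, the helper `apply`, and the convention $x = m\omega^2/K - 2$ used to label the roots are assumptions of this sketch.

```python
import math

N = 6  # number of masses on the ring

# Dimensionless potential matrix V/(K r^2): 2 on the diagonal, -1 for ring neighbours.
V = [[0.0] * N for _ in range(N)]
for i in range(N):
    V[i][i] = 2.0
    V[i][(i + 1) % N] = -1.0
    V[i][(i - 1) % N] = -1.0


def apply(matrix, vec):
    """Matrix-vector product for plain Python lists."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]


# Trial eigenvectors: cosine waves with p wavelengths around the ring.
eigenvalues = []
for p in range(N):
    lam = 2.0 - 2.0 * math.cos(2.0 * math.pi * p / N)  # candidate m*omega^2/K
    vec = [math.cos(2.0 * math.pi * p * j / N) for j in range(N)]
    residual = max(abs(a - lam * b) for a, b in zip(apply(V, vec), vec))
    assert residual < 1e-12  # V vec = lam vec to rounding error
    eigenvalues.append(lam)

# Roots expressed through x = m*omega^2/K - 2.
x_roots = sorted(round(lam - 2.0, 9) for lam in eigenvalues)
print("x roots:", x_roots)
```

The zero eigenvalue belongs to the uniform vector (all masses moving together), and the largest eigenvalue belongs to the alternating vector in which adjacent masses move 180° out of phase.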
12.10 Discrete Lattice Chain

Crystalline lattices and linear molecules are important classes of coupled oscillator systems where nearest-neighbor interactions dominate. A crystalline lattice comprises thousands of coupled oscillators in a three-dimensional matrix with atomic spacing of a few $10^{-10}$ m. Even though a full description of the dynamics of crystalline lattices demands a quantal treatment, a classical treatment is of interest since classical mechanics underlies many features of the motion of atoms in a crystalline lattice. The linear discrete lattice chain is the simplest example of many-body coupled oscillator systems that can illuminate the physics underlying a range of interesting phenomena in solid-state physics. As illustrated in example 2.7, the linear approximation usually is applicable for small-amplitude displacements of nearest-neighbor interacting systems, which greatly simplifies treatment of the lattice chain. The linear discrete lattice chain involves three independent polarization modes: one longitudinal mode plus two perpendicular transverse modes. The 3n degrees of freedom for the n atoms on a discrete linear lattice chain are partitioned with n degrees of freedom for each of the three polarization modes. Each of these three polarization modes has n normal modes, or n travelling waves, and exhibits quantization, dispersion, and the possibility of a complex wavenumber.

12.10.1  Longitudinal  motion 

The equations of motion for longitudinal modes of the lattice chain can be derived by considering a linear chain of n identical masses, of mass m, separated by a uniform spacing d as shown in Fig 12.7. Assume that the n masses are coupled by n + 1 springs, with spring constant $\kappa$, where both ends of the chain are fixed, that is, the displacements $q_0 = q_{n+1} = 0$ and velocities $\dot{q}_0 = \dot{q}_{n+1} = 0$. The force required to stretch a length d of the chain by a longitudinal displacement $q_j$ for mass j is $F_j = \kappa q_j$. Thus the potential energy for stretching the spring for segment $(q_{j-1} - q_j)$ is $U_j = \frac{\kappa}{2}(q_{j-1} - q_j)^2$. The total potential and kinetic energies are

$$U = \frac{\kappa}{2} \sum_{j=1}^{n+1} (q_{j-1} - q_j)^2 \tag{12.74}$$

$$T = \frac{1}{2} m \sum_{j=1}^{n} \dot{q}_j^2 \tag{12.75}$$

Since $\dot{q}_{n+1} = 0$ the kinetic energy and Lagrangian can be extended to j = n + 1, that is, the Lagrangian can be written as

$$L = \frac{1}{2} \sum_{j=1}^{n+1} \left( m\dot{q}_j^2 - \kappa (q_{j-1} - q_j)^2 \right) \tag{12.76}$$

Using this Lagrangian in the Lagrange-Euler equations gives the following second-order equation of motion for longitudinal oscillations

$$\ddot{q}_j = \omega_0^2 (q_{j-1} - 2q_j + q_{j+1}) \tag{12.77}$$

where j = 1, 2, ..., n and where

$$\omega_0 \equiv \sqrt{\frac{\kappa}{m}} \tag{12.78}$$

Figure 12.7: Portion of a lattice chain of identical masses m connected by identical springs of spring constant $\kappa$. The displacement of the jth mass from the equilibrium position is $q_j$, assumed to be positive to the right.


12.10.2  Transverse  motion 

The equations of motion for transverse motion on a linear discrete lattice chain, illustrated in figure 12.8, can be derived by considering the displacements $q_j$ of the jth mass for n identical masses, with mass m, separated by equal spacings d, and assuming that the tension in the string is $\tau$. Assuming that the transverse deflections $q_j$ are small, the j - 1 to j spring is stretched to a length

$$d' = \sqrt{d^2 + (q_j - q_{j-1})^2} \tag{12.79}$$

Thus the incremental stretching is

$$\delta d = \frac{(q_j - q_{j-1})^2}{2d} \tag{12.80}$$

The work done against the tension $\tau$ is $\tau \cdot \delta d$ per segment. Thus the total potential energy is

$$U = \frac{\tau}{2d} \sum_{j=1}^{n+1} (q_{j-1} - q_j)^2 \tag{12.81}$$

where $q_0$ and $q_{n+1}$ are identically zero. The kinetic energy is

$$T = \frac{1}{2} m \sum_{j=1}^{n} \dot{q}_j^2 \tag{12.82}$$

Since $\dot{q}_{n+1} = 0$, the kinetic energy and Lagrangian summations can be extended to j = n + 1, that is

$$L = \frac{1}{2} \sum_{j=1}^{n+1} \left( m\dot{q}_j^2 - \frac{\tau}{d} (q_{j-1} - q_j)^2 \right) \tag{12.83}$$

Figure 12.8: Transverse motion of a linear discrete lattice chain

Using this Lagrangian in the Lagrange-Euler equations gives the following second-order equation of motion for transverse oscillations

$$\ddot{q}_j = \omega_0^2 (q_{j-1} - 2q_j + q_{j+1}) \tag{12.84}$$

where j = 1, 2, ..., n and

$$\omega_0 \equiv \sqrt{\frac{\tau}{md}} \tag{12.85}$$
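The small-deflection approximation for the stretch of one segment, $\delta d \approx (q_j - q_{j-1})^2/2d$, can be compared with the exact stretch $\sqrt{d^2 + (q_j - q_{j-1})^2} - d$. The following sketch is illustrative only; the function names and the sample deflections are assumptions.

```python
import math


def exact_stretch(d: float, dq: float) -> float:
    """Exact elongation of a segment of rest length d whose ends differ transversely by dq."""
    return math.sqrt(d * d + dq * dq) - d


def approx_stretch(d: float, dq: float) -> float:
    """Leading-order (small-deflection) elongation dq^2 / (2 d)."""
    return dq * dq / (2.0 * d)


d = 1.0
for dq in (0.2, 0.1, 0.01):
    exact, approx = exact_stretch(d, dq), approx_stretch(d, dq)
    rel_error = abs(approx - exact) / exact
    print(f"dq/d = {dq:5.2f}  exact = {exact:.6e}  approx = {approx:.6e}  rel. error = {rel_error:.2e}")
```

The relative error scales roughly as $(\Delta q/d)^2/4$, which is why the quadratic potential energy is adequate only for small transverse deflections.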


The normal modes for the transverse modes comprise standing waves that satisfy the same boundary conditions as for the longitudinal modes. The n equations of motion for longitudinal motion, equation 12.77, or transverse motion, equation 12.84, are identical in form. The major difference is that $\omega_0$ for the transverse normal modes, $\omega_0 = \sqrt{\frac{\tau}{md}}$, differs from that for the longitudinal modes, which is $\omega_0 = \sqrt{\frac{\kappa}{m}}$. Thus the following discussion of the normal modes on a discrete lattice chain is identical in form for both transverse and longitudinal waves.


12.10.3  Normal  modes 

The normal modes of the n equations of motion on the discrete lattice chain are either longitudinal or transverse standing waves that satisfy the boundary conditions at the extreme ends of the lattice chain. The solutions can be given by assuming that the n identical masses on the chain oscillate with a common frequency $\omega$. Then the displacement amplitude for the jth mass can be written in the form

$$q_j(t) = a_j e^{i\omega t} \tag{12.86}$$

where the amplitude $a_j$ can be complex. Substitution into the preceding n equations of motion, 12.77 and 12.84, yields the following recursion relation

$$\left(-\omega^2 + 2\omega_0^2\right) a_j - \omega_0^2 \left(a_{j-1} + a_{j+1}\right) = 0 \tag{12.87}$$

where j = 1, 2, ..., n. Note that the boundary conditions $q_0 = 0$ and $q_{n+1} = 0$ require that $a_0 = a_{n+1} = 0$.

The above recursion relation corresponds to a system of n homogeneous algebraic equations with n unknowns $a_1, a_2, \ldots, a_n$. A non-trivial solution is given by setting the determinant of its coefficients equal to zero

$$\begin{vmatrix} -\omega^2 + 2\omega_0^2 & -\omega_0^2 & 0 & \cdots & 0 \\ -\omega_0^2 & -\omega^2 + 2\omega_0^2 & -\omega_0^2 & \cdots & 0 \\ 0 & -\omega_0^2 & -\omega^2 + 2\omega_0^2 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & -\omega^2 + 2\omega_0^2 \end{vmatrix} = 0 \tag{12.88}$$
This secular determinant corresponds to the special case of nearest neighbor interactions with the kinetic energy tensor T being diagonal and the potential energy tensor V involving coupling only to adjacent masses. The secular determinant is of order n and thus determines exactly n eigenfrequencies $\omega_r$ for each polarization mode.

For large n, the solution of this problem is more efficiently obtained by using a recursion relation approach, rather than solving the above secular determinant. The trick is to assume that the phase differences $\phi_r$ between the motion of adjacent masses all are identical for a given polarization. Then the amplitude for the jth mass for the rth frequency mode $\omega_r$ is of the form

$$a_{jr} = a_r e^{i(j\phi_r - \delta_r)} \tag{12.89}$$

Inserting the above into the recursion relation (12.87) gives

$$\left(-\omega_r^2 + 2\omega_0^2\right) - \omega_0^2 \left[e^{-i\phi_r} + e^{i\phi_r}\right] = 0 \tag{12.90}$$

which reduces to

$$\omega_r^2 = 2\omega_0^2 - 2\omega_0^2 \cos\phi_r = 4\omega_0^2 \sin^2\frac{\phi_r}{2}$$

that is

$$\omega_r = 2\omega_0 \sin\frac{\phi_r}{2} \tag{12.91}$$

where r = 1, 2, 3, ..., n.

Now it is necessary to determine the phase angle $\phi_r$, which can be done by applying the boundary conditions for standing waves on the lattice chain. These boundary conditions for stationary modes require that the ends of the lattice chain are nodes, that is, $a_{0r} = a_{(n+1)r} = 0$. Using the fact that only the real part of $a_{jr}$ has physical meaning leads to the amplitude for the jth mass for the rth mode being

$$a_{jr} = a_r \cos(j\phi_r - \delta_r) \tag{12.92}$$

The boundary condition $a_{0r} = 0$ requires that the phase $\delta_r = \frac{\pi}{2}$. That is

$$a_{jr} = a_r \cos\left(j\phi_r - \frac{\pi}{2}\right) = a_r \sin j\phi_r \tag{12.93}$$

where r = 1, 2, ..., n.

The boundary condition for j = n + 1 gives

$$a_{(n+1)r} = 0 = a_r \sin (n+1)\phi_r \tag{12.94}$$

Therefore

$$(n+1)\phi_r = r\pi \tag{12.95}$$

where r = 1, 2, 3, ..., n. That is

$$\phi_r = \frac{r\pi}{n+1} = \frac{r\pi d}{(n+1)d} = \frac{r\pi d}{D} = k_r d \tag{12.96}$$

where D = (n + 1)d is the total length of the discrete lattice chain.

The n eigenfrequencies for a given polarization are given by

$$\omega_r = 2\omega_0 \sin\frac{r\pi}{2(n+1)} = 2\omega_0 \sin\frac{r\pi d}{2(n+1)d} = 2\omega_0 \sin\frac{r\pi d}{2D} = 2\omega_0 \sin\frac{k_r d}{2} \tag{12.97}$$

where the corresponding wavenumber $k_r$ is given by

$$k_r = \frac{r\pi}{(n+1)d} = \frac{r\pi}{D} = \frac{2\pi}{\lambda_r} \tag{12.98}$$

This implies that the normal modes are quantized with half-wavelengths $\frac{\lambda_r}{2} = \frac{D}{r}$.
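The closed-form frequencies and mode shapes for the fixed-end chain can be checked directly against the recursion relation for the amplitudes. The sketch below, with function names chosen only for this illustration, evaluates $\omega_r = 2\omega_0 \sin[r\pi/2(n+1)]$ and $a_{jr} = \sin[jr\pi/(n+1)]$, and reports the largest residual of $(-\omega_r^2 + 2\omega_0^2)a_j - \omega_0^2(a_{j-1} + a_{j+1})$ over the n moving masses.

```python
import math


def mode_frequency(r: int, n: int, omega0: float = 1.0) -> float:
    """omega_r = 2 omega0 sin(r pi / (2 (n + 1))) for n masses with both ends fixed."""
    return 2.0 * omega0 * math.sin(r * math.pi / (2.0 * (n + 1)))


def mode_shape(r: int, n: int) -> list:
    """Amplitudes sin(j r pi / (n + 1)) for j = 0 .. n+1; the two end points are nodes."""
    return [math.sin(j * r * math.pi / (n + 1)) for j in range(n + 2)]


def max_residual(r: int, n: int, omega0: float = 1.0) -> float:
    """Largest violation of the amplitude recursion relation over the masses j = 1 .. n."""
    a = mode_shape(r, n)
    w2 = mode_frequency(r, n, omega0) ** 2
    return max(
        abs((-w2 + 2.0 * omega0**2) * a[j] - omega0**2 * (a[j - 1] + a[j + 1]))
        for j in range(1, n + 1)
    )


n = 5
for r in range(1, n + 1):
    print(f"r = {r}: omega_r/omega_0 = {mode_frequency(r, n):.4f}, residual = {max_residual(r, n):.1e}")
```

The residuals vanish to rounding error for every r from 1 to n, confirming that the assumed constant phase difference between neighbours solves all n coupled equations at once.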
Figure 12.9: Plots of the maximal vibrational amplitudes $a_r$ for the rth frequency sinusoidal mode, versus distance along the chain, for transverse normal modes of a vibrating discrete lattice with n = 5. Only r = 1, 2, 3, 4, 5 are distinct modes because r = 6 is a null mode. Note that the modes with r = 7, 8, 9, 10, 11, 12, shown dashed, duplicate the locations of the mass displacement given by the lower-order modes.


Combining equations 12.96 and 12.93 gives the maximum amplitudes for the eigenvectors to be

$$a_{jr} = a_r \sin(j k_r d) \tag{12.99}$$

For n independent linear oscillators there are only n independent normal modes, that is, for r = n + 1 the sine function in equation 12.99 must be zero. Beyond r = n the equations do not describe physically new situations. This is illustrated by figure 12.9 which shows the transverse modes of a lattice chain with n = 5. There are only n = 5 independent normal modes of this system since r = n + 1 = 6 corresponds to a null mode with all $q_j(t) = 0$. Also note that the solutions for r > n + 1, shown dashed, replicate the mass locations of modes with r < n + 1, that is, the modes with r > 6 are replicas of the lower-order modes.
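The replication of the lower-order modes can be made concrete for n = 5. Because $\sin[j(2(n+1) - r)\pi/(n+1)] = -\sin[jr\pi/(n+1)]$ at the integer mass positions j, a mode with index r > n + 1 places the masses exactly where the mode with index 2(n + 1) - r does, apart from an overall sign. The sketch below checks this numerically; the function name and tolerance are assumptions of the illustration.

```python
import math


def mass_amplitudes(r: int, n: int) -> list:
    """Amplitudes sin(j r pi / (n + 1)) sampled at the n mass positions j = 1 .. n."""
    return [math.sin(j * r * math.pi / (n + 1)) for j in range(1, n + 1)]


n = 5

# r = n + 1 is the null mode: every mass sits at a node.
null_mode = mass_amplitudes(n + 1, n)
print("r = 6 is null:", all(abs(a) < 1e-12 for a in null_mode))

# Higher r reproduce the mass positions of a lower-order partner mode.
for r in range(n + 2, 2 * n + 2):
    partner = 2 * (n + 1) - r
    high, low = mass_amplitudes(r, n), mass_amplitudes(partner, n)
    replicated = all(abs(h + l) < 1e-12 for h, l in zip(high, low))
    print(f"r = {r:2d} replicates r = {partner} up to an overall sign: {replicated}")
```

Between the masses the high-r sine curve oscillates more rapidly, but since only the mass positions are physical, these solutions carry no new information.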

Note that $\omega_r$ has a maximum value $\omega_r \leq 2\omega_0$ since the sine function cannot exceed unity. This leads to a maximum frequency $\omega_c = 2\omega_0$, called the cut-off frequency, which occurs when $k_r d = \pi$. That is, the null mode occurs when r = n + 1, for which equation 12.99 equals zero. The range of n quantized normal modes that can occur is intuitive. That is, the longest half-wavelength $\frac{\lambda_{max}}{2} = D = (n+1)d$ equals the total length of the discrete lattice chain. The shortest half-wavelength $\frac{\lambda_{min}}{2} = d$ is set by the lattice spacing. Thus the discrete wavenumbers of the normal modes, for each polarization, range from $k_1$ to $nk_1$ where n is an integer.

Assuming real $k_r$, the normal coordinate $\eta_r$ and corresponding frequency $\omega_r$ are related by

$$\eta_r = a_r e^{i\omega_r t} \tag{12.100}$$

Equations 12.97 and 12.99 give the angular frequency and displacement. Note that superposition applies since this system is linear. Therefore the most general solution for each polarization can be any superposition of the form

$$q_j(t) = \sum_{r=1}^{n} a_r e^{i\omega_r t} \sin\left[\frac{r\pi j}{(n+1)}\right] \tag{12.101}$$




CHAPTER  12.  COUPLED  LINEAR  OSCILLATORS 


12.10.4  Travelling  waves 

Travelling waves are equally good solutions of the equations of motion 12.77 and 12.84 as are the normal modes. Travelling waves on the one-dimensional lattice chain will be of the form

$$q(x, t) = Ce^{i(\omega t \pm kx)} \tag{12.102}$$

where the distance along the chain is $x = \nu d$, that is, it is quantized in units of the cell spacing d, with $\nu$ being an integer. The positive sign in the exponent corresponds to a wave travelling in the $-x$ direction while the negative sign corresponds to a wave travelling in the $+x$ direction. A point of fixed phase on the travelling wave must satisfy the condition that $\omega t \pm kx$ is a constant. This occurs if the magnitude of the phase velocity of the wave is given by

$$v_{phase} = \left|\frac{dx}{dt}\right| = \frac{\omega}{k} \tag{12.103}$$

The wave has a frequency $f = \frac{\omega}{2\pi}$ and wavelength $\lambda = \frac{2\pi}{k}$, thus the phase velocity is $v_{phase} = \frac{\omega}{k} = \lambda f$.

Inserting the travelling wave 12.102 into the transverse equation of motion 12.84 for the discrete lattice chain gives

$$-\omega_r^2 q_r = \omega_0^2 \left(e^{-i\phi_r} - 2 + e^{i\phi_r}\right) q_r \tag{12.104}$$

for each of the masses j = 1, 2, ..., n, where $\phi_r = k_r d$ is the phase difference between adjacent masses. That is

$$\omega_r = \pm 2\omega_0 \sin\frac{\phi_r}{2} \tag{12.105}$$

The phase $\phi_r$ is determined by the Born-von Karman periodic boundary condition, which assumes that the chain is duplicated indefinitely on either side of $k = \pm\frac{\pi}{d}$. Thus, for n discrete masses, k must satisfy the condition that $q_r = q_{r+n}$. That is

$$e^{i k_r n d} = 1 \tag{12.106}$$

That is

$$k_r = \frac{2\pi r}{nd} \tag{12.107}$$

Note that the periodic boundary condition gives n discrete modes for wavenumbers between

$$-\frac{\pi}{d} < k_r \leq +\frac{\pi}{d} \tag{12.108}$$

where the index r satisfies

$$-\frac{n}{2} < r \leq \frac{n}{2}$$

Thus equation 12.105 becomes

$$\omega_r = \pm 2\omega_0 \sin\frac{k_r d}{2} \tag{12.109}$$


Equation 12.109 is a dispersion relation that is identical to equation 12.97 derived during the discussion of the normal modes of the lattice chain. This confirms that the travelling waves on the lattice chain are equally good solutions as the normal standing-wave modes. Clearly, superposition of the standing-wave normal modes can lead to travelling waves and vice versa.
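The statement that travelling waves solve the lattice equations of motion can be verified numerically. The sketch below lists the n wavenumbers $k_r = 2\pi r/nd$ allowed by a periodic boundary condition, builds the travelling wave $e^{i(\omega t - k\nu d)}$ at each lattice site $\nu$, and checks both the equation of motion $-\omega^2 q_\nu = \omega_0^2(q_{\nu-1} - 2q_\nu + q_{\nu+1})$ and the periodicity over n sites. The function names, the choice n = 8, and the sample time are assumptions of the illustration.

```python
import cmath
import math


def allowed_wavenumbers(n: int, d: float = 1.0) -> list:
    """k_r = 2 pi r / (n d) for the n integers r with -n/2 < r <= n/2."""
    r_min = -((n - 1) // 2)
    return [2.0 * math.pi * r / (n * d) for r in range(r_min, r_min + n)]


def dispersion(k: float, d: float = 1.0, omega0: float = 1.0) -> float:
    """Positive branch of omega = 2 omega0 |sin(k d / 2)|."""
    return 2.0 * omega0 * abs(math.sin(k * d / 2.0))


def wave(nu: int, t: float, k: float, omega: float, d: float = 1.0) -> complex:
    """Travelling wave exp(i (omega t - k nu d)) sampled at lattice site nu."""
    return cmath.exp(1j * (omega * t - k * nu * d))


n, d, omega0, t = 8, 1.0, 1.0, 0.37
worst = 0.0
for k in allowed_wavenumbers(n, d):
    omega = dispersion(k, d, omega0)
    for nu in range(n):
        q_prev = wave(nu - 1, t, k, omega, d)
        q_here = wave(nu, t, k, omega, d)
        q_next = wave(nu + 1, t, k, omega, d)
        lhs = -(omega**2) * q_here
        rhs = omega0**2 * (q_prev - 2.0 * q_here + q_next)
        worst = max(worst, abs(lhs - rhs))
    # Periodicity: site n carries the same displacement as site 0.
    worst = max(worst, abs(wave(n, t, k, omega, d) - wave(0, t, k, omega, d)))

print(f"number of modes = {len(allowed_wavenumbers(n, d))}, largest residual = {worst:.1e}")
```

Each allowed wavenumber satisfies the equation of motion only when paired with the frequency given by the dispersion relation, which is the content of the comparison with the standing-wave result.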

12.10.5  Dispersion 


Figure 12.10: Plot of the dispersion curve ($\omega$ versus k) for a monoatomic linear lattice chain subject to only nearest neighbor interactions. The first Brillouin zone is the segment $-\frac{\pi}{d} \leq k \leq \frac{\pi}{d}$ which covers all independent solutions.

The lattice chain is an interesting example of a dispersive system in that $\omega_r$ is a function of $k_r$. Figure 12.10 shows a plot of the dispersion curve ($\omega$ versus k) for a monoatomic linear lattice chain subject to only nearest neighbor interactions. Note that $\omega$ depends linearly on k for small k and that $\frac{d\omega}{dk} = 0$ at the boundaries of the first Brillouin zone.
The lattice chain has a phase velocity for the rth wave given by

$$v_{phase} = \frac{\omega_r}{k_r} = \omega_0 d \left| \frac{\sin\frac{k_r d}{2}}{\frac{k_r d}{2}} \right| \tag{12.110}$$

while the group velocity is

$$v_{group} = \frac{d\omega}{dk} = \omega_0 d \cos\frac{k_r d}{2} \tag{12.111}$$

Note that in the limit when $\frac{k_r d}{2} \to 0$, the phase velocity and group velocity are identical, that is, $v_{phase} = v_{group} = \omega_0 d$.
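The two velocities can be tabulated across the first Brillouin zone to see how they separate as k grows. The sketch below evaluates $v_{phase} = \omega/k$ and $v_{group} = \omega_0 d \cos(kd/2)$ from the dispersion relation $\omega = 2\omega_0 \sin(kd/2)$ for positive k; the function names and the sample wavenumbers are assumptions of the illustration.

```python
import math


def omega(k: float, d: float = 1.0, omega0: float = 1.0) -> float:
    """Dispersion relation omega = 2 omega0 sin(k d / 2), for 0 < k <= pi/d."""
    return 2.0 * omega0 * math.sin(k * d / 2.0)


def phase_velocity(k: float, d: float = 1.0, omega0: float = 1.0) -> float:
    """v_phase = omega / k."""
    return omega(k, d, omega0) / k


def group_velocity(k: float, d: float = 1.0, omega0: float = 1.0) -> float:
    """v_group = d(omega)/dk = omega0 d cos(k d / 2)."""
    return omega0 * d * math.cos(k * d / 2.0)


d, omega0 = 1.0, 1.0
for k in (1e-3, 0.5, 1.5, math.pi / d):
    print(
        f"k d = {k * d:6.4f}  v_phase = {phase_velocity(k, d, omega0):.4f}"
        f"  v_group = {group_velocity(k, d, omega0):.4f}"
    )
```

In units of $\omega_0 d$, both velocities approach 1 for small k, while at the zone boundary the group velocity falls to zero and the phase velocity falls to $2/\pi$.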

12.10.6  Complex  wavenumber 

The maximum allowed frequency, which is called the cut-off frequency, $\omega_c = 2\omega_0$, occurs when $k_r d = \pi$, that is, $\frac{\lambda}{2} = d$. That is, the minimum half-wavelength equals the spacing d between the discrete masses. At the cut-off frequency, the phase velocity is $v_{phase} = \frac{2}{\pi}\omega_0 d$ and the group velocity is $v_{group} = 0$.

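It is interesting to note that $\omega_r$ can exceed the cut-off frequency $\omega_c = 2\omega_0$ if $k_r$ is assumed to be complex, that is, if

$$k_r = \kappa_r - i\Gamma_r \tag{12.112}$$

Then

$$\omega_r = 2\omega_0 \sin\frac{k_r d}{2} = 2\omega_0 \sin\frac{d}{2}\left(\kappa_r - i\Gamma_r\right) = 2\omega_0 \left[\sin\frac{\kappa_r d}{2}\cosh\frac{\Gamma_r d}{2} - i\cos\frac{\kappa_r d}{2}\sinh\frac{\Gamma_r d}{2}\right] \tag{12.113}$$

To ensure that $\omega_r$ is real, the imaginary term must be zero, that is

$$\cos\frac{\kappa_r d}{2} = 0 \tag{12.114}$$

Therefore

$$\sin\frac{\kappa_r d}{2} = 1 \tag{12.115}$$

that is, $\kappa_r = \frac{\pi}{d}$, and the dispersion relation between $\omega$ and k for $\omega > 2\omega_0$ becomes

$$\omega_r = 2\omega_0 \cosh\frac{\Gamma_r d}{2} \tag{12.116}$$

which increases with $\Gamma_r$. Thus, when $\omega > \omega_c = 2\omega_0$ the amplitude of the wave is of the form

$$q_r(t) = a_r e^{-\Gamma_r x} e^{i(\omega_r t - \kappa_r x)} \tag{12.117}$$

which corresponds to a spatially damped oscillatory wave with phase velocity

$$v_{phase} = \frac{\omega_r}{\kappa_r} \tag{12.118}$$

and damping factor $\Gamma_r$.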
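For a driving frequency above the cut-off, the damping constant follows from inverting $\omega = 2\omega_0 \cosh(\Gamma d/2)$. The sketch below computes $\Gamma$ for an assumed frequency of 1.5 times the cut-off, substitutes the complex wavenumber $k = \pi/d - i\Gamma$ back into the general dispersion relation $\omega = 2\omega_0 \sin(kd/2)$ to confirm that a real frequency results, and evaluates the amplitude reduction per lattice spacing. Function and variable names are assumptions of the illustration.

```python
import cmath
import math


def damping_constant(omega: float, d: float = 1.0, omega0: float = 1.0) -> float:
    """Gamma solving omega = 2 omega0 cosh(Gamma d / 2) for omega at or above the cut-off."""
    if omega < 2.0 * omega0:
        raise ValueError("omega lies below the cut-off; the wavenumber is real")
    return (2.0 / d) * math.acosh(omega / (2.0 * omega0))


d, omega0 = 1.0, 1.0
omega = 3.0 * omega0                      # assumed value: 1.5 times the cut-off frequency
gamma = damping_constant(omega, d, omega0)
k_complex = complex(math.pi / d, -gamma)  # k = kappa - i Gamma with kappa = pi/d

# The general dispersion relation evaluated at the complex wavenumber.
omega_check = 2.0 * omega0 * cmath.sin(k_complex * d / 2.0)
attenuation_per_cell = math.exp(-gamma * d)

print(f"Gamma d = {gamma * d:.4f}")
print(f"Re(omega) = {omega_check.real:.6f}, Im(omega) = {omega_check.imag:.1e}")
print(f"amplitude ratio between adjacent masses = {attenuation_per_cell:.4f}")
```

The real part of the wavenumber stays pinned at the zone boundary, so adjacent masses still move in antiphase while the envelope decays exponentially along the chain.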

There are many examples in physics where the wavenumber is complex, as exhibited by the discrete lattice chain for $\frac{\lambda}{2} < d$. Other examples are electromagnetic waves in conductors or plasma (example 3.5), matter waves tunnelling through a potential barrier, or standing waves on musical instruments which have a complex wavenumber k due to damping.

This simple toy model of the discrete linear lattice chain has illustrated that classical mechanics explains many features of the many-body nearest-neighbor coupled linear oscillator system, including normal modes, standing and travelling waves, cut-off frequency, dispersion, and complex wavenumber. These phenomena feature prominently in applications of the quantal discrete coupled-oscillator system to solid-state physics.




12.11  Damped  coupled  linear  oscillators 


The discussion of coupled linear oscillators has neglected non-conservative damping forces which always exist to some extent in physical systems. In general, dissipative forces are nonlinear, which greatly complicates solving the equations of motion for such coupled oscillator systems. However, for some systems the dissipative forces depend linearly on velocity, which allows use of the Rayleigh dissipation function, described in chapter 8.7.2. The most general definition of the Rayleigh dissipation function, 8.72, was given to be

$$\mathcal{F} = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij} \dot{q}_i \dot{q}_j \tag{12.119}$$

For this special case, it was shown in chapter 8 that the Lagrange equations can be written in terms of the Rayleigh dissipation function as

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j} + \frac{\partial \mathcal{F}}{\partial \dot{q}_j} = Q_j \tag{12.120}$$


where $Q_j$ are generalized forces acting on the system that are not absorbed into the potential U. Using equations 12.43, 12.44, and 12.120 allows the equations of motion for damped coupled linear oscillators to be written in matrix form as

$$\{T\}\ddot{q} + \{C\}\dot{q} + \{V\}q = \{Q\} \tag{12.121}$$

where the symmetric matrices {T}, {C}, and {V} are positive definite for positive definite systems. Rayleigh pointed out that in the special case where the damping matrix {C} is a linear combination of the {T} and {V} matrices, the matrix {C} is diagonal in the normal-coordinate basis, leading to a separation of the damped system into normal modes. As discussed in chapter 4, many systems in nature are linear for small amplitude oscillations, allowing use of the Rayleigh dissipation function which provides an analytic solution. However, in general, except when {C} is small, this separation into normal modes is not possible for damped systems and the solutions must be obtained numerically.
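As a worked illustration of Rayleigh's special case, suppose the damping matrix is written as $\{C\} = \alpha\{T\} + \beta\{V\}$, where the constants $\alpha$ and $\beta$ are symbols introduced here and are not part of the original text. If the normal coordinates $\eta_r$ are normalized so that {T} transforms to the unit matrix and {V} to the diagonal matrix of $\omega_r^2$, the same transformation takes {C} to a diagonal matrix with elements $\alpha + \beta\omega_r^2$, and the matrix equation of motion separates into n independent damped oscillators:

```latex
\{C\} = \alpha\{T\} + \beta\{V\}
\quad\Longrightarrow\quad
\ddot{\eta}_r + \left(\alpha + \beta\,\omega_r^2\right)\dot{\eta}_r + \omega_r^2\,\eta_r = Q^{\eta}_r ,
\qquad r = 1, 2, \ldots, n
```

Here $Q^{\eta}_r$ denotes the generalized force projected onto the rth normal coordinate. Each mode then has its own damping coefficient, with higher-frequency modes damped more strongly whenever $\beta > 0$.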

The following two examples illustrate approaches used to handle linearly-damped coupled-oscillator systems.


12.11 Example: Two linearly-damped coupled linear oscillators


Consider the two coupled oscillator system shown, where the two carts have spring constants $k_1$, $k_2$ and linear damping constants $c_1$, $c_2$. As discussed in example 12.3, the kinetic energy is given by

$$T = \frac{1}{2} m_1 \dot{q}_1^2 + \frac{1}{2} m_2 \dot{q}_2^2 \tag{a}$$

and the potential energy is given by

$$U = \frac{1}{2}\left[k_1 q_1^2 + k_2 (q_2 - q_1)^2\right] = \frac{1}{2}\left[(k_1 + k_2) q_1^2 - 2k_2 q_1 q_2 + k_2 q_2^2\right] \tag{b}$$

Figure: Two linearly-damped coupled linear oscillators.

Similarly the Rayleigh dissipation function has the form

$$\mathcal{F} = \frac{1}{2}\left[c_1 \dot{q}_1^2 + c_2 (\dot{q}_2 - \dot{q}_1)^2\right] = \frac{1}{2}\left[(c_1 + c_2) \dot{q}_1^2 - 2c_2 \dot{q}_1 \dot{q}_2 + c_2 \dot{q}_2^2\right] \tag{c}$$

Inserting a, b, and c into equation 12.120 gives the two equations of motion to be

$$m_1 \ddot{q}_1 + (c_1 + c_2)\dot{q}_1 - c_2 \dot{q}_2 + (k_1 + k_2) q_1 - k_2 q_2 = 0$$

$$m_2 \ddot{q}_2 - c_2 \dot{q}_1 + c_2 \dot{q}_2 - k_2 q_1 + k_2 q_2 = 0$$

When the drag is zero the solution of these two coupled equations can be separated into two independent normal modes of the system as described earlier. Usually it is not possible to separate the motion into decoupled normal modes except for certain cases where the dissipative forces can be described by Rayleigh's dissipation function.
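When the damped equations do not separate into normal modes, they can be integrated numerically. The sketch below applies a classical fourth-order Runge-Kutta step to the two equations of motion of the damped two-cart system and tracks the mechanical energy T + U, which should decrease because the Rayleigh dissipation function is non-negative. All parameter values, initial conditions, and function names are illustrative assumptions, not values from the text.

```python
def accelerations(q, v, m, k, c):
    """Accelerations from the two equations of motion of the damped two-cart system."""
    (q1, q2), (v1, v2) = q, v
    (m1, m2), (k1, k2), (c1, c2) = m, k, c
    a1 = (-(c1 + c2) * v1 + c2 * v2 - (k1 + k2) * q1 + k2 * q2) / m1
    a2 = (c2 * v1 - c2 * v2 + k2 * q1 - k2 * q2) / m2
    return (a1, a2)


def energy(q, v, m, k):
    """Total mechanical energy T + U of the two carts."""
    (q1, q2), (v1, v2) = q, v
    (m1, m2), (k1, k2) = m, k
    kinetic = 0.5 * (m1 * v1**2 + m2 * v2**2)
    potential = 0.5 * (k1 * q1**2 + k2 * (q2 - q1) ** 2)
    return kinetic + potential


def rk4_step(q, v, dt, m, k, c):
    """One classical fourth-order Runge-Kutta step for the state (q, v)."""

    def deriv(q_, v_):
        return v_, accelerations(q_, v_, m, k, c)

    def shift(x, dx, h):
        return tuple(xi + h * dxi for xi, dxi in zip(x, dx))

    dq1, dv1 = deriv(q, v)
    dq2, dv2 = deriv(shift(q, dq1, dt / 2), shift(v, dv1, dt / 2))
    dq3, dv3 = deriv(shift(q, dq2, dt / 2), shift(v, dv2, dt / 2))
    dq4, dv4 = deriv(shift(q, dq3, dt), shift(v, dv3, dt))
    q_new = tuple(
        qi + dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
        for qi, s1, s2, s3, s4 in zip(q, dq1, dq2, dq3, dq4)
    )
    v_new = tuple(
        vi + dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
        for vi, s1, s2, s3, s4 in zip(v, dv1, dv2, dv3, dv4)
    )
    return q_new, v_new


# Illustrative values only: masses, spring constants, damping constants.
m, k, c = (1.0, 1.0), (1.0, 0.5), (0.05, 0.02)
q, v = (1.0, 0.0), (0.0, 0.0)   # first cart displaced, both carts at rest
dt, steps = 0.01, 2000

energies = [energy(q, v, m, k)]
for _ in range(steps):
    q, v = rk4_step(q, v, dt, m, k, c)
    energies.append(energy(q, v, m, k))

print(f"E(t = 0) = {energies[0]:.4f}, E(t = {dt * steps:.0f}) = {energies[-1]:.4f}")
```

Setting both damping constants to zero in this sketch leaves the energy constant to within the integration error, which is a simple check on the integrator.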
12.12 Collective synchronization of coupled oscillators

Collective synchronization of coupled oscillators is a multifaceted phenomenon where large ensembles of coupled oscillators, with comparable natural frequencies, self-synchronize, leading to coherent collective modes of motion. Biological examples include congregations of synchronously flashing fireflies, crickets that chirp in unison, an audience clapping at the end of a performance, networks of pacemaker cells in the heart, insulin-secreting cells in the pancreas, and neural networks in the brain and spinal cord that control rhythmic behaviors such as breathing, walking, and eating. Example 12.12 illustrates an application to nuclei.

An ensemble of coupled oscillators will have a frequency distribution with a finite width. It is interesting to elucidate how an ensemble of coupled oscillators that has a finite-width frequency distribution can self-synchronize its motion to a unique common frequency, and how that synchronization is maintained over long time periods. The answers to these issues provide insight into the dynamics of coupled oscillators.

The discussion of coupled oscillators has implicitly assumed n identical undamped linear oscillators that have identical, infinitely-sharp, natural frequencies $\omega_i$. In nature typical coupled oscillators can have a finite-width frequency distribution about some average value, due to the natural variability of the oscillator parameters for biological systems, the manufacturing tolerances for mechanical oscillators, or the natural Lorentzian frequency distribution associated with the uncertainty principle that occurs even for atomic clocks where the oscillator frequencies are defined directly by the physical constants. Assume that the ensemble of coupled oscillators has a frequency distribution $g(\omega)$ about some average value.

Undamped  linear  oscillators  have  elliptical  closed-path  trajectories  in  phase  space  whereas  dissipation 
leads  to  a spiral  attractor  unless  the  system  is  driven  such  as  to  preserve  the  total  energy.  As  described 
in  chapter  4.4  many  systems  in  nature,  especially  biological  systems,  have  closed  limit  cycles  in  phase 
space  where  the  energy  lost  to  dissipation  is  replenished  by  a driving  mechanism.  The  simplest  systems  for 
understanding  collective  synchronization  of  coupled  oscillators  are  those  that  involve  closed  limit  cycles  in 
phase  space. 

N. Wiener first recognized the ubiquity of collective synchronization in the natural world, but his mathematical approach, based on Fourier integrals, was not suited to this problem. A more fruitful approach was pioneered in 1967 by an undergraduate student A.T. Winfree[Win67] who recognized that the long-time behavior of a large ensemble of limit-cycle oscillators can be characterized in the simplest terms by considering only the phase of closed phase-space trajectories. He assumed that the instantaneous state of an ensemble of oscillators can be represented by points distributed around the circular phase-space diagram shown in figure 12.11. For uncoupled oscillators these points will be distributed randomly around the circle, whereas coupling of the oscillators will result in a spatial correlation of the points. That is, the dynamics of the phases can be visualized as a swarm of points running around the unit circle in the complex plane of the phase space diagram. The complex order parameter of this swarm can be defined to be the magnitude and phase of the centroid of this swarm

$$r e^{i\psi} = \frac{1}{N} \sum_{j=1}^{N} e^{i\theta_j} \tag{12.122}$$

where $\theta_j$ is the phase of the jth oscillator. The centroid of the ensemble of points on the phase diagram has a magnitude r, designating the offset of the centroid from the center of the circular phase diagram, and $\psi$ which is the phase of this centroid.

A uniform distribution of points around the unit circle will lead to a centroid r = 0. Correlated motion leads to a bunching of the points around some phase value leading to a non-zero centroid r and angle $\psi$. If the swarm acts like a fully-coupled single oscillator then $r \approx 1$ with an appropriate phase $\psi$.

The Kuramoto model[Kur75, Str00] incorporates Winfree's intuition by mapping the limit cycles onto a simple circular phase diagram and incorporating the long-term dynamics of coupled oscillators in terms of the relative phases for a mean-field system. That is, the angular velocity of the phase for the ith oscillator is

$$\dot{\phi}_i = \omega_i + \sum_{j=1}^{N} \Gamma_{ij}(\phi_j - \phi_i) \tag{12.123}$$

where i = 1, 2, ..., N. Kuramoto recognized that mean-field coupling was the most tractable system to solve, that is, a system where the coupling is applicable equally to all the oscillators. Moreover, he assumed an equally-weighted, pure sinusoidal coupling for the coupling term $\Gamma_{ij}(\phi_j - \phi_i)$ between the coupled oscillators. That is, he assumed

$$\Gamma_{ij}(\phi_j - \phi_i) = \frac{K}{N} \sin(\phi_j - \phi_i) \tag{12.124}$$

where K > 0 is the coupling strength, and the factor $\frac{1}{N}$ ensures that the model is well behaved as $N \to \infty$. Kuramoto assumed that the frequency distribution $g(\omega)$ was unimodal and symmetric about the mean frequency $\Omega$, that is, $g(\Omega + \omega) = g(\Omega - \omega)$.

Figure 12.11: Order parameter for weakly-coupled oscillators.

Figure 12.12: Kuramoto model of collective synchronization of coupled oscillators. The left and center plots show the time and coupling strength dependence of the order parameter r. The right plot shows the frequency dependence including coupling (solid line) and without coupling (dashed line).

This problem can be simplified by exploiting the rotational symmetry and transforming to a frame of reference that is rotating at an angular frequency $\Omega$. That is, use the transformation $\theta_i = \phi_i - \Omega t$ where $\theta_i$ is measured in the rotating frame. This makes $g(\omega)$ unimodal with a symmetric frequency distribution about $\omega = 0$. The phase velocity in this rotating frame is

$$\dot{\theta}_i = \omega_i + \sum_{j=1}^{N} \frac{K}{N} \sin(\theta_j - \theta_i) \tag{12.125}$$

Kuramoto observed that the phase-space distribution can be expressed in terms of the order parameters r, $\psi$ in that equation 12.122 can be multiplied on both sides by $e^{-i\theta_i}$ to give

$$r e^{i(\psi - \theta_i)} = \frac{1}{N} \sum_{j=1}^{N} e^{i(\theta_j - \theta_i)} \tag{12.126}$$

Equating the imaginary parts yields

$$r \sin(\psi - \theta_i) = \frac{1}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_i) \tag{12.127}$$

This allows equation 12.125 to be written as

$$\dot{\theta}_i = \omega_i + K r \sin(\psi - \theta_i) \tag{12.128}$$

for i = 1, 2, ..., N. Equation 12.128 reflects the mean-field aspect of the model in that each oscillator $\theta_i$ is attracted to the phase of the mean field $\psi$ rather than to the phase of another individual oscillator.
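The identity that replaces the pairwise sum by the mean-field term can be confirmed numerically for an arbitrary set of phases. The sketch below draws 50 random phases, computes the order parameter $(r, \psi)$ from the centroid of the points $e^{i\theta_j}$, and compares $\frac{1}{N}\sum_j \sin(\theta_j - \theta_i)$ with $r\sin(\psi - \theta_i)$ for every oscillator. The function name, the ensemble size, and the random seed are assumptions of the illustration.

```python
import cmath
import math
import random


def order_parameter(thetas):
    """Return (r, psi) defined by r exp(i psi) = (1/N) sum_j exp(i theta_j)."""
    centroid = sum(cmath.exp(1j * th) for th in thetas) / len(thetas)
    return abs(centroid), cmath.phase(centroid)


random.seed(1)  # arbitrary seed, used only for repeatability
thetas = [random.uniform(0.0, 2.0 * math.pi) for _ in range(50)]
r, psi = order_parameter(thetas)

# Compare the pairwise sum with the mean-field form for every oscillator i.
max_difference = max(
    abs(sum(math.sin(tj - ti) for tj in thetas) / len(thetas) - r * math.sin(psi - ti))
    for ti in thetas
)
print(f"r = {r:.4f}, largest difference = {max_difference:.1e}")
```

Because the identity is exact, the mean-field form reduces the cost of evaluating all N phase velocities from order $N^2$ operations to order N.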

Simulations showed that the evolution of the order parameter with coupling strength $K$ is as illustrated in figure 12.12. This simulation shows the following. (1) For all $K$ below a certain threshold $K_c$, the order parameter decays to an incoherent jitter, as expected for random scatter of $N$ points. (2) When $K > K_c$ this incoherent state becomes unstable and the order parameter $r$ grows exponentially, reflecting the nucleation of small clusters of oscillators that are mutually synchronized. (3) The population of individual oscillators splits into two groups. The oscillators near the center of the distribution lock together in phase at the mean angular frequency $\Omega$ and co-rotate with average phase $\psi(t)$, whereas those with frequencies lying further from the center continue to rotate independently at their natural frequencies and drift relative to the coherent cluster frequency $\Omega$. As a consequence this mixed state is only partially synchronized, as illustrated on the right side of figure 12.12. The synchronized fraction has a $\delta$-function behavior for the frequency distribution which grows in intensity with further increase in $K$. The unsynchronized component has nearly the original frequency distribution $g(\omega)$ except that it is depleted in the region of the locked frequency due to strength absorbed by the $\delta$-function component.
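
The threshold behavior described above can be reproduced with a small simulation. This is a rough sketch under stated assumptions, not the simulation behind figure 12.12: it assumes a Gaussian $g(\omega)$ of unit width, uses simple Euler steps of equation 12.128, and the function name, step size, and population size are illustrative choices.

```python
import cmath
import math
import random


def simulate_r(K, n=200, sigma=1.0, dt=0.05, steps=600, seed=7):
    """Euler-integrate equation 12.128; return r averaged over the last half."""
    rng = random.Random(seed)
    omegas = [rng.gauss(0.0, sigma) for _ in range(n)]   # assumed Gaussian g(omega)
    thetas = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    r_samples = []
    for step in range(steps):
        # Order parameter r*exp(i*psi) of the current configuration.
        z = sum(cmath.exp(1j * th) for th in thetas) / n
        r, psi = abs(z), cmath.phase(z)
        if step >= steps // 2:
            r_samples.append(r)
        # Each oscillator is pulled toward the mean-field phase psi.
        thetas = [th + dt * (w + K * r * math.sin(psi - th))
                  for th, w in zip(thetas, omegas)]
    return sum(r_samples) / len(r_samples)


r_weak = simulate_r(K=0.5)    # well below the threshold for this g(omega)
r_strong = simulate_r(K=4.0)  # well above the threshold
```

For weak coupling the averaged $r$ stays at the level of the finite-$N$ jitter, while for strong coupling it settles close to unity, in line with points (1) and (2) above.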

Kuramoto’s  toy  model  nicely  illustrates  the  essential  features  of  the  evolution  of  collective  synchronization 
with coupling strength. It has been applied to the study of neuronal synchronization in the brain [Cum07]. The
model  illustrates  that  the  collective  synchronization  of  coupled  oscillators  leads  to  a component  that  has  a 
single  frequency  for  correlated  motion  which  can  be  much  narrower  than  the  inherent  frequency  distribution 
of  the  ensemble  of  coupled  oscillators. 

12.12  Example:  Collective  motion  in  nuclei 

The  nucleus  is  an  unusual  quantal  system  that  involves  the  coupled  motion  of  the  many  nucleons.  It 
exhibits  features  characteristic  of  the  many-body  classical  coupled  oscillator  with  coupling  between  all  the 
valence  nucleons.  Nuclear  structure  can  be  described  by  a shell  model  of  individual  nucleons  bound  in  weakly 
interacting  orbits  in  a central  average  mean  field  that  is  produced  by  the  summed  attraction  of  all  the  nucleons 
in  the  nucleus.  However,  nuclei  also  exhibit  features  characteristic  of  collective  rotation  and  vibration  of  a 
quantal fluid. For example, beautiful rotational bands up to spin over $60\hbar$ are observed in heavy nuclei. These
rotational  bands  are  similar  to  those  observed  in  the  rotational  structure  of  diatomic  molecules.  Actinide 
nuclei  also  can  fission  into  two  large  fragments  which  is  another  manifestation  of  collective  motion. 

Figure 12.13 shows the case of collective bands in $^{238}$U populated by Coulomb exciting a 1355 MeV $^{238}$U beam by a $^{208}$Pb target. This case exhibits both quadrupole and octupole collective rotational bands up to spin 40. The inset shows the moment of inertia plotted versus the angular rotational energy $\hbar\omega$. The electromagnetic $E2$ transition rates correspond to collective motion of $\approx 32$ nucleons. Collective motion of many nucleons is the antithesis of shell model motion where the nucleons are assumed to follow independent orbiting motion like planets around the Sun. Although the nucleus is a quantal system, this strange dichotomy can be understood in terms of a classical rotating system having weak linear coupling between each of many similar harmonic oscillators, which in this case are nucleons bound in a spheroidally-deformed shell-model potential well.

The essential general feature of weakly-coupled identical oscillators is illustrated by the solutions of the three linearly-coupled identical oscillators where the most symmetric state is displaced in frequency from the remaining states. For $n$ identical oscillators, one state is displaced significantly in energy from the remaining $n-1$ degenerate states. This most symmetric state is pushed downwards in energy if the residual coupling force is attractive, and it is pushed upwards if the coupling force is repulsive. This symmetric state corresponds to the coherent oscillation of all the coupled oscillators, and carries all of the strength for the corresponding dominant multipole for the coupling force. In the nucleus this state corresponds to coherent shape oscillations of many nucleons.
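
The splitting into one displaced state plus $n-1$ degenerate states can be illustrated with a schematic matrix. The sketch below is an assumption made for illustration, not a model taken from the text: it uses an $n \times n$ matrix with unperturbed value $a$ for every oscillator and the same coupling $b$ between every pair (and on the diagonal), and the helper names are hypothetical.

```python
def coupled_matrix(n, a, b):
    """Schematic matrix a*I + b*J, where J is the n x n matrix of ones."""
    return [[(a if i == j else 0.0) + b for j in range(n)] for i in range(n)]


def matvec(m, v):
    """Matrix-vector product for lists of lists."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def is_eigenvector(m, v, lam, tol=1e-12):
    """True if m v = lam v to within tol in every component."""
    return all(abs(x - lam * y) <= tol for x, y in zip(matvec(m, v), v))


n, a, b = 5, 1.0, -0.1          # b < 0 mimics an attractive residual coupling
m = coupled_matrix(n, a, b)

coherent = [1.0] * n            # all oscillators in phase
coherent_ok = is_eigenvector(m, coherent, a + n * b)

# Vectors with one oscillator out of phase with another: eigenvalue a, (n-1)-fold.
others_ok = all(
    is_eigenvector(m, [1.0 if i == 0 else (-1.0 if i == k else 0.0)
                       for i in range(n)], a)
    for k in range(1, n)
)
```

In this schematic model the in-phase vector is shifted by $nb$, downwards for $b < 0$ and upwards for $b > 0$, while the remaining $n-1$ vectors stay degenerate at $a$.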

The  weak  residual  electric  quadrupole  and  octupole  nucleon-nucleon  correlations  in  the  nucleon-nucleon 
interactions  generate  collective  quadrupole  and  octupole  motion  in  nuclei.  The  collective  synchronization 
of  such  coherent  quadrupole  and  octupole  excitation  leads  to  collective  bands  of  states,  that  correspond  to 
synchronized  in-phase  motion  of  the  protons  and  neutrons  in  the  valence  oscillator  shell.  These  modes 
correspond  to  rotations  and  vibrations  about  the  center  of  mass.  The  attractive  residual  nucleon-nucleon 
interaction  couples  the  many  individual  particle  excitations  in  a given  shell  producing  one  coherent  state 
that  is  pushed  downwards  in  energy  far  from  the  remaining  n — 1 degenerate  states.  This  coherent  state 
involves  correlated  motion  of  the  nucleons  that  corresponds  to  a macroscopic  oscillation  of  a charged  fluid. 
For non-closed shell nuclei like $^{238}$U, the dominant quadrupole multipole in the residual nucleon-nucleon interaction leads to the ground state being a coherent state corresponding to $\approx 16$ protons plus $\approx 20$ neutrons oscillating in phase. The collective motion of the charged protons leads to electromagnetic $E2$ radiation with a transition decay amplitude about 16 times larger than for a single proton. This corresponds to the radiative decay probability being enhanced by a factor of $\approx 256$ relative to radiation by a single proton. This collective state corresponds to a macroscopic quadrupole deformation at low excitation energies that exhibits both collective rotational and vibrational degrees of freedom as shown in the figure. This coherent state is analogous to the correlated flow of individual water molecules in a tidal wave. The weaker octupole term in the residual interaction leads to an octupole [pear-shaped] coupled oscillator coherent state lying slightly above the quadrupole coherent state. In contrast to the rotational motion of strongly quadrupole-deformed nuclei, the octupole deformation exhibits more vibrational-like properties than rotational motion of a charged tidal wave. The observed large increase in moment of inertia at higher rotational frequencies, shown in the inset, is due to the Coriolis force aligning the individual valence nucleons along the rotational axis. Thus, although the nucleus $^{238}$U is the epitome of a complicated many-body quantal system, it is apparent that basic classical mechanics of coupled oscillators, and rotation, underlie the physics phenomena exhibited by synchronized collective motion in the nuclear many-body system.

Figure 12.13: Collective rotational bands in the nucleus $^{238}$U excited by Coulomb excitation. [Sim98]

The close correspondence between classical mechanics predictions and the excitation phenomena observed for the $^{238}$U nucleus is surprising for a system that is the epitome of a many-body quantal fluid. The following list identifies other manifestations of classical mechanics discussed in this book that play a role in this experimental study.

1. Coincident detection of the excited nuclei recoiling in vacuum was used to identify the exact scattering angles, plus recoil velocities, of the scattered nuclei. This specifies the hyperbolic Rutherford trajectory for each scattered nucleus, the nuclear masses, and their recoil velocities. The deexcitation $\gamma$-rays, emitted in flight by each recoiling nucleus, were detected in coincidence with the scattered nuclei. Knowledge of the recoil velocities and scattering angles enabled correction for the Doppler shift in energy of each detected coincident $\gamma$-ray.

2. The transition energies and angular distribution of the deexcitation $\gamma$-rays determined the energies, spins, and parities of the excited states in $^{238}$U.

3. The measured yields of the coincident deexcitation $\gamma$-rays determined the excitation cross section as a function of the nuclear scattering angle.

4. A full quantal calculation for this system is beyond the capabilities of modern computers since the experiment involves excitation of $\approx 100$ excited levels, coupled by about 1000 electromagnetic matrix elements, and the scattering involves inclusion of thousands of partial waves due to the long range of the Coulomb potential and the heavy mass of the scattered nuclei. Therefore a semi-classical approximation is used for the quantal calculation of the electromagnetic excitation cross sections as a function of time as the scattered nuclei traverse Rutherford’s hyperbolic Coulomb scattering trajectory for each scattered nucleus.

5. The measured cross sections for the deexcitation $\gamma$-rays are compared with the predicted cross sections to determine the $\approx 1000$ electromagnetic matrix elements connecting the states in $^{238}$U.

6. The electromagnetic matrix elements have been measured in the laboratory frame of reference. Much more insight into the collective motion in $^{238}$U is obtained by transforming the electromagnetic matrix elements into the body-fixed frame of reference for this rotating deformed body. Rotational invariants, described in chapter 11.16, are used to derive the electromagnetic properties in the rotating body-fixed frame of reference, which unambiguously determines the electromagnetic shape for each excited nuclear state observed in $^{238}$U.

7. Hamiltonian mechanics, based on the Routhian $R_{noncyclic}$, is used to make theoretical model calculations of the nuclear structure of $^{238}$U in the rotating body-fixed frame for comparison with the experimental data derived from this experiment.

This  experiment  illustrates  that  classical  mechanics  plays  a key  role  in  all  aspects  of  the  study  of  the 
nuclear  structure  of  this  many-body  quantal  system. 


12.13  Summary 

This chapter has focussed on many-body coupled linear oscillator systems which are a ubiquitous feature in nature. The main conclusions are summarized below.

Normal modes: It was shown that coupled linear oscillators exhibit normal modes and normal coordinates that correspond to independent modes of oscillation with characteristic eigenfrequencies $\omega_i$.


General analytic theory for coupled linear oscillators: Lagrangian mechanics was used to derive the general analytic procedure for solution of the many-body coupled oscillator problem which reduces to the conventional eigenvalue problem. A summary of the procedure for solving coupled oscillator problems is as follows.

1) Choose generalized coordinates $q_j$ and evaluate $T$ and $U$.

$$T = \frac{1}{2}\sum_{j,k}^{n} T_{jk}\,\dot{q}_j\dot{q}_k \tag{12.41}$$

and

$$U = \frac{1}{2}\sum_{j,k}^{n} V_{jk}\,q_j q_k \tag{12.42}$$

where the components of the $T$ and $V$ tensors are

$$T_{jk} \equiv \sum_{\alpha}^{N} m_{\alpha}\sum_{i}^{3}\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k} \tag{12.43}$$

and

$$V_{jk} \equiv \left(\frac{\partial^2 U}{\partial q_j\,\partial q_k}\right)_0 \tag{12.44}$$

2) Determine the eigenvalues $\omega_r$ using the secular determinant.

$$\begin{vmatrix} V_{11}-\omega^2 T_{11} & V_{12}-\omega^2 T_{12} & V_{13}-\omega^2 T_{13} \\ V_{12}-\omega^2 T_{12} & V_{22}-\omega^2 T_{22} & V_{23}-\omega^2 T_{23} \\ V_{13}-\omega^2 T_{13} & V_{23}-\omega^2 T_{23} & V_{33}-\omega^2 T_{33} \end{vmatrix} = 0 \tag{12.52}$$

3) The eigenvectors are obtained by inserting the eigenvalues $\omega_r$ into

$$\sum_{j}^{n}\left(V_{jk}-\omega_r^2 T_{jk}\right)a_{jr} = 0 \tag{12.51}$$

4) From the initial conditions determine the complex scale factors $\beta_r$ where

$$\eta_r(t) = \beta_r e^{i\omega_r t} \tag{12.58}$$

5) Determine the normal coordinates where each $\eta_r$ is a normal mode. The normal coordinates can be expressed as

$$\boldsymbol{\eta} = \{\mathbf{a}\}^{-1}\mathbf{q} \tag{12.61}$$
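
Steps 1 to 3 of the procedure above can be sketched for the simplest case of two coordinates, where the secular determinant is a quadratic in $\omega^2$. This is a minimal illustration, not the text’s own implementation: the function name `normal_modes_2x2` is hypothetical, and the example values (two unit masses, outer springs of constant 1, coupling spring of constant 0.5) are assumptions chosen for convenience.

```python
import math


def normal_modes_2x2(T, V):
    """Solve det(V - w^2 T) = 0 for two coordinates.

    T and V are symmetric 2x2 lists of lists. Returns [(w, (a1, a2)), ...]
    sorted by frequency. A fully degenerate uncoupled pair is not handled.
    """
    (t11, t12), (_, t22) = T
    (v11, v12), (_, v22) = V
    # det(V - lam*T) = qa*lam^2 + qb*lam + qc with lam = w^2
    qa = t11 * t22 - t12 * t12
    qb = -(v11 * t22 + v22 * t11 - 2.0 * v12 * t12)
    qc = v11 * v22 - v12 * v12
    disc = math.sqrt(max(qb * qb - 4.0 * qa * qc, 0.0))
    modes = []
    for lam in sorted([(-qb - disc) / (2.0 * qa), (-qb + disc) / (2.0 * qa)]):
        # The first row of (V - lam*T) a = 0 fixes the ratio a1 : a2.
        a1, a2 = -(v12 - lam * t12), v11 - lam * t11
        if a1 == 0.0 and a2 == 0.0:
            # First row vanishes identically, so use the second row instead.
            a1, a2 = v22 - lam * t22, -(v12 - lam * t12)
        norm = math.hypot(a1, a2)
        modes.append((math.sqrt(lam), (a1 / norm, a2 / norm)))
    return modes


# Assumed example: T_jk = M delta_jk, V = [[k + k', -k'], [-k', k + k']]
T = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.5, -0.5], [-0.5, 1.5]]
modes = normal_modes_2x2(T, V)
```

For this example the lower mode has equal amplitudes (the symmetric mode) and the higher mode has opposite amplitudes (the antisymmetric mode). Larger systems need a general eigenvalue solver rather than the closed-form quadratic used here.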

Few-body coupled oscillator systems: The general analytic theory was used to determine the solutions for parallel and series couplings of two and three linear oscillators. The phenomena observed include degenerate and non-degenerate eigenvalues and spurious center-of-mass oscillatory modes. There are two broad classifications for three or more coupled oscillators, that is, either complete coupling of all oscillators, or coupling of the nearest-neighbor oscillators. It is observed that the eigenvalue corresponding to the most coherent motion of the coupled oscillators corresponds to the most collective motion and its eigenvalue is displaced the most in energy from the remaining eigenvalues. For some systems this coherent collective mode corresponded to a center-of-mass motion with no internal excitation of the other modes, while the other eigenvalues corresponded to modes with internal excitation of the oscillators such that the center of mass is stationary. The above procedure has been applied to two classifications of coupling, complete coupling of many oscillators, and nearest neighbor coupling. Both degenerate and spurious center-of-mass modes were observed. Strong collective shape degrees of freedom in nuclei are examples of complete coupling due to the weak residual interactions between nucleons in the nucleus. It was seen that, for many coupled oscillators, one coherent state separates from the other states and this coherent state carries the bulk of the collective strength.

Discrete lattice chain: Transverse and longitudinal modes of motion on the discrete lattice chain were discussed because of the important role it plays in nature, such as in crystalline lattice structures. Both normal modes and travelling waves were discussed including the phenomena of dispersion and cut-off frequencies. Molecules and the crystalline lattice chains are examples where nearest neighbor coupling is manifest. It was shown that, for the $n$-oscillator discrete lattice chain, there are only $n$ independent longitudinal modes plus $n$ modes for each of the two transverse polarizations, and that the angular frequency $\omega \leq 2\omega_0$, that is, a cut-off frequency exists.

Damped  coupled  linear  oscillators  It  was  shown  that  linearly-damped  coupled  oscillator  systems  can 
be  solved  analytically  using  the  concept  of  the  Rayleigh  dissipation  function. 

Collective synchronization of coupled oscillators: The Kuramoto schematic phase model was used to illustrate how weak residual forces can cause collective synchronization of the motion of many coupled oscillators. This is applicable to biological systems as well as mechanical systems.

Workshop exercises


1. Consider two masses (each of mass $M$) connected by a spring to each other and by springs to fixed positions. Motion is only allowed along one dimension. (This is exactly the same system that is discussed in chapter 15.2 of the lecture notes on coupled oscillations.) Let each of the two oscillator springs have a force constant $\kappa$ and let the force constant of the coupling spring be $\kappa_{12}$ (written $\kappa'$ in the equations of motion below). Let $x_1$ and $x_2$ be the coordinates as described in the textbook.

(a) Draw a picture of the two masses displaced by a small amount. Using the picture, try to make sense of the equations of motion as given in the text:

$$M\ddot{x}_1 + (\kappa + \kappa')x_1 - \kappa' x_2 = 0, \qquad M\ddot{x}_2 + (\kappa + \kappa')x_2 - \kappa' x_1 = 0$$

(b) Each of the trial solutions is written in the form $Be^{i\omega t}$. Why are the trial solutions written this way? Are there any other ways to write the trial solution?

(c) For a nontrivial solution to exist for the pair of simultaneous equations resulting from the substitution of the trial solution, the determinant of the coefficients of $B_1$ and $B_2$ must vanish. Why must this be the case? Is a similar statement true when considering three masses? What about $n$ masses?

(d) Suppose you had the actual two-mass system sitting in front of you. How could you create antisymmetric motion? How could you create symmetric motion? Can you describe each of these motions using a set of suitable initial conditions?


2. Two particles, each with mass $m$, move in one dimension in a region near a local minimum of the potential energy where the potential energy is approximately given by

$$U = \frac{1}{2}k\left(7x_1^2 + 4x_2^2 + 4x_1x_2\right)$$

where $k$ is a constant.

(a) Determine the frequencies of oscillation.

(b) Determine the normal coordinates.

3. What is degeneracy? When does it arise?

4. The Lagrangian of three coupled oscillators is given by:

$$L = \sum_{i=1}^{3}\left[\frac{m\dot{x}_i^2}{2} - \frac{kx_i^2}{2}\right] + k'(x_1x_2 + x_2x_3).$$

Find $x_2(t)$ for the following initial conditions (at $t = 0$):

$$(x_1, x_2, x_3) = (x_0, 0, 0), \qquad (\dot{x}_1, \dot{x}_2, \dot{x}_3) = (0, 0, v_0).$$


5.  A mechanical  analog  of  the  benzene  molecule  comprises  a discrete  lattice  chain  of  6 point  masses  M connected 
in  a plane  hexagonal  ring  by  6 identical  springs  each  with  spring  constant  K and  length  d. 

a)  List  the  wave  numbers  of  the  allowed  undamped  longitudinal  standing  waves. 

b)  Calculate  the  phase  velocity  and  group  velocity  for  longitudinal  travelling  waves  on  the  ring. 

c) Determine the time dependence of a longitudinal standing wave for an angular frequency $\omega = 2\omega_{cutoff}$, that is, twice the cut-off frequency.

6. Consider a one dimensional, two-mass, three-spring system governed by the matrix $A$,

$$A = \begin{pmatrix} 4 & -2 \\ -2 & 7 \end{pmatrix}$$

such that $A\mathbf{x} = \omega^2\mathbf{x}$.

(a) Determine the eigenfrequencies and normal coordinates.

(b) Choose a set of initial conditions such that the system oscillates at its highest eigenfrequency.

(c) Determine the solutions $x_1(t)$ and $x_2(t)$.

Problems

1. Four identical masses $m$ are connected by four identical springs, spring constant $\kappa$, and constrained to move on a frictionless circle of radius $b$ as shown on the left in the figure.

a)  How  many  normal  modes  of  small  oscillation  are  there? 

b)  What  are  the  eigenfrequencies  of  the  small  oscillations? 

c)  Describe  the  motion  of  the  four  masses  for  each  eigenfrequency. 


2. Consider the two identical coupled oscillators given on the right in the figure assuming $\kappa_1 = \kappa_2 = \kappa$. Let both oscillators be linearly damped with a damping constant $\beta$. A force $F = F_0\cos(\omega t)$ is applied to mass $m$. Write down the pair of coupled differential equations that describe the motion. Obtain a solution by expressing the differential equations in terms of the normal coordinates. Show that the normal coordinates $\eta_1$ and $\eta_2$ exhibit resonance peaks at the characteristic frequencies $\omega_1$ and $\omega_2$ respectively.

[Figure: two coupled oscillators with masses $m_1 = M$ and $m_2 = M$, outer springs $\kappa_1 = \kappa$ and $\kappa_2 = \kappa$, and coupling spring $\kappa_{12}$.]


3.  As  shown  on  the  left  below  the  mass  M moves  horizontally  along  a frictionless  rail.  A pendulum  is  hung  from 
M with  a weightless  rod  of  length  b with  a mass  m at  its  end. 

a) Prove that the eigenfrequencies are

$$\omega_1 = 0 \qquad \omega_2 = \sqrt{\frac{g}{Mb}(M + m)}$$


b)  Describe  the  normal  modes. 


Chapter  13 

Hamilton’s  principle  of  least  action 


13.1  Introduction 

In  two  papers  published  in  1834  and  1835,  Hamilton  announced  a dynamical  principle  upon  which  it  is 
possible  to  base  all  of  mechanics,  and  indeed  most  of  classical  physics.  Hamilton  was  seeking  a theory  of 
optics  when  he  developed  Hamilton’s  Principle,  plus  the  field  of  Hamiltonian  mechanics,  both  of  which  play 
a pivotal  role  in  classical  mechanics. 

Hamilton’s Principle is based on defining the action functional$^1$ $S$ of the $n$ generalized coordinates $\mathbf{q}$ and their corresponding velocities $\dot{\mathbf{q}}$.

$$S = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt \tag{13.1}$$

The scalar quantity $S$ is a functional of the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$. In principle, higher order time derivatives of the generalized coordinates could be included, but most systems in classical mechanics are described adequately by including only the generalized coordinates, plus their velocities. Note that the definition of the action functional does not limit the specific form of the Lagrangian. That is, it allows for more general Lagrangians than the standard Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t) = T(\dot{\mathbf{q}}, t) - U(\mathbf{q}, t)$ that was used throughout chapters 5-12. Hamilton stated that the actual trajectory of a mechanical system is given by requiring that the action functional is stationary. Written in terms of a virtual infinitesimal displacement $\delta$, the action functional is stationary if

$$\delta S = \delta\int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt = 0 \tag{13.2}$$

Typically this stationary point corresponds to a minimum of the action functional. Applying variational calculus to the action functional leads to the Lagrange equations of motion for the system. That is, Hamilton’s Principle, applied to the Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$, generates the Lagrangian equations of motion.

$$\left\{\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j}\right\} = 0 \tag{13.3}$$

These Lagrange equations agree with those derived using d’Alembert’s Principle, if the $\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}$ generalized force terms are ignored.
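
The stationarity of the action can be illustrated numerically. The sketch below is an assumption-laden example, not taken from the text: it uses the standard Lagrangian $L = \frac{1}{2}m\dot{y}^2 - mgy$ for a particle in uniform gravity, a path that starts and ends at $y = 0$, a simple discretization of the action integral, and a sinusoidal trial variation that vanishes at the end points. All function names are hypothetical.

```python
import math


def discrete_action(ys, dt, m=1.0, g=9.81):
    """Sum of (T - U)*dt over segments, with U = m*g*y at the segment midpoint."""
    total = 0.0
    for y0, y1 in zip(ys, ys[1:]):
        v = (y1 - y0) / dt
        total += (0.5 * m * v * v - m * g * 0.5 * (y0 + y1)) * dt
    return total


def sampled_path(eps, n=1000, t_total=1.0, g=9.81):
    """True path y = g*t*(T - t)/2 plus the variation eps*sin(pi*t/T)."""
    dt = t_total / n
    ts = [k * dt for k in range(n + 1)]
    ys = [0.5 * g * t * (t_total - t) + eps * math.sin(math.pi * t / t_total)
          for t in ts]
    return ys, dt


def action_of(eps):
    """Discretized action of the path with variation amplitude eps."""
    ys, dt = sampled_path(eps)
    return discrete_action(ys, dt)


s_true = action_of(0.0)
s_plus = action_of(0.1)
s_minus = action_of(-0.1)
```

The action of the varied paths exceeds that of the true path for either sign of the variation, and the two varied values agree with each other, so the change in $S$ has no term linear in the variation. For this example the stationary point is a minimum.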

Hamilton’s  Principle  can  be  considered  to  be  the  fundamental  postulate  of  classical  mechanics.  It  replaces 
Newton’s  postulated  three  laws  of  motion.  As  illustrated  in  chapters  6 — 12,  Lagrangian  mechanics  based  on 
the  standard  Lagrangian  L = T — U,  provides  a remarkably  powerful  and  consistent  approach  to  solving  the 
equations  of  motion  in  classical  mechanics.  This  chapter  extends  the  discussion  to  non-standard  Lagrangians. 

Chapter  5.12  developed  a plausibility  argument,  based  on  Newton’s  laws  of  motion,  that  led  to  the 
Lagrange  equations  of  motion  using  the  standard  Lagrangian.  d’Alembert’s  Principle  of  virtual  work  was 
used  in  chapter  6 to  provide  a more  fundamental  derivation  of  Lagrange’s  equations  of  motion  which  was 
based  on  the  standard  Lagrangian.  An  important  feature  is  that  Hamilton’s  Principle  extends  Lagrangian 
mechanics  to  the  use  of  non-standard  Lagrangians. 

$^1$The term action functional often is abbreviated to action. It is called Hamilton’s Principal Function in older texts.

13.2 Principle of Least Action


Hamilton’s crowning achievement was deriving both Lagrangian mechanics, and Hamiltonian mechanics, directly in terms of a general form of his principle of least action $S$, equation 13.2. Consider the action $S_A$ for the extremum path of a system in configuration space, that is, along path $A$ from coordinates $q_j(t_1)$ at $t = t_1$ to $q_j(t_2)$ at $t = t_2$ shown in figure 13.1, where $j = 1, 2, \ldots, n$. Then the action $S_A$ is given by

$$S_A = \int_{t_1}^{t_2} L(\mathbf{q}(t), \dot{\mathbf{q}}(t), t)\,dt \tag{13.4}$$

As used in chapter 5.2, a family of neighboring paths is defined by adding an infinitesimal fraction $\epsilon$ of a continuous, well-behaved neighboring function $\eta_j$ where $\epsilon = 0$ for the extremum path.

$$q_j(t, \epsilon) = q_j(t, 0) + \epsilon\,\eta_j(t) \tag{13.5}$$

In contrast to the variational case discussed when deriving Lagrangian mechanics, the variational path used here does not assume that the functions $\eta_j(t)$ vanish at the end points. Assume that the neighboring path $B$ has an action $S_B$ where

$$S_B = \int_{t_1 + \Delta t}^{t_2 + \Delta t} L(\mathbf{q}(t) + \delta\mathbf{q}(t), \dot{\mathbf{q}}(t) + \delta\dot{\mathbf{q}}(t))\,dt \tag{13.6}$$

Figure 13.1: Extremum path $A$, plus the neighboring path $B$, shown in configuration space.

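Expanding the integrand of $S_B$ in equation 13.6 gives that, relative to the extremum path $A$, the incremental change in action is

$$\delta S = S_B - S_A = \int_{t_1}^{t_2}\sum_j\left(\frac{\partial L}{\partial q_j}\delta q_j + \frac{\partial L}{\partial \dot{q}_j}\delta\dot{q}_j\right)dt + \left[L\,\Delta t\right]_{t_1}^{t_2} \tag{13.7}$$

The second term in the integral can be integrated by parts since $\delta\dot{q}_j = \frac{d(\delta q_j)}{dt}$, leading to

$$\delta S = \int_{t_1}^{t_2}\sum_j\left(\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j}\right)\delta q_j\,dt + \left[\sum_j\frac{\partial L}{\partial \dot{q}_j}\delta q_j + L\,\Delta t\right]_{t_1}^{t_2} \tag{13.8}$$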


Note that equation 13.8 includes contributions from the entire path of the integral as well as the variations at the ends of the curve and the $\Delta t$ terms. Equation 13.8 leads to the following two pioneering principles of least action in variational mechanics that were developed by Hamilton.


13.2.1  Hamilton’s  Principle 

Derivation of Lagrangian mechanics in chapter 6 was based on the extremum path for neighboring paths between two given locations $\mathbf{q}(t_1)$ and $\mathbf{q}(t_2)$ that the system occupies at times $t_1$ and $t_2$ respectively. For this special case, where the end points do not vary, that is, when $\delta q_j(t_1) = \delta q_j(t_2) = 0$ and $\Delta t_1 = \Delta t_2 = 0$, the variation of the action $\delta S$ for the stationary path (13.8) reduces to

$$\delta S = \int_{t_1}^{t_2}\sum_j\left\{\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j}\right\}\delta q_j\,dt = 0 \tag{13.9}$$

For independent generalized coordinates the variations $\delta q_j$ are arbitrary, so the integrand in brackets vanishes, leading to the Euler-Lagrange equations. Conversely, if the Euler-Lagrange equations in 13.9 are satisfied, then $\delta S = 0$, that is, the path is stationary. This leads to the statement that the path in configuration space between two configurations $\mathbf{q}(t_1)$ and $\mathbf{q}(t_2)$ that the system occupies at times $t_1$ and $t_2$ respectively, is that for which the action $S$ is stationary. This is a statement of Hamilton’s Principle.

13.2.2 Least-action principle in Hamiltonian mechanics

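Consideration of the general variation of the least-action path leads to Hamilton’s basic equations of Hamiltonian mechanics. For the general path, the integral term in equation 13.8 vanishes because the Euler-Lagrange equations are obeyed for the stationary path. Thus the only remaining non-zero contributions are due to the end point terms, which can be written by defining the total variation of each end point to be

$$\Delta q_j = \delta q_j + \dot{q}_j\,\Delta t \tag{13.10}$$

where $\delta q_j$ and $\dot{q}_j$ are evaluated at $t_1$ and $t_2$. Then equation 13.8 reduces to

$$\delta S = \left[\sum_j\frac{\partial L}{\partial \dot{q}_j}\delta q_j + L\,\Delta t\right]_{t_1}^{t_2} = \left[\sum_j\frac{\partial L}{\partial \dot{q}_j}\Delta q_j + \left(L - \sum_j\frac{\partial L}{\partial \dot{q}_j}\dot{q}_j\right)\Delta t\right]_{t_1}^{t_2} \tag{13.11}$$

Since the generalized momentum $p_j = \frac{\partial L}{\partial \dot{q}_j}$, equation 13.11 can be expressed in terms of the Hamiltonian and generalized momentum as

$$\delta S = \left[\sum_j p_j\,\Delta q_j - H\,\Delta t\right]_{t_1}^{t_2} = \left[\mathbf{p}\cdot\Delta\mathbf{q} - H\,\Delta t\right]_{t_1}^{t_2} \tag{13.12}$$

$$\frac{\partial S}{\partial q_j} = p_j \tag{13.13}$$

Equation 13.12 contains Hamilton’s Principle of Least-action. Equation 13.13 gives an alternative relation for the generalized momentum $p_j$ in terms of the action functional $S$.

Integrating the action $\delta S$, equation 13.11, between the end points gives the action for the path between $t = t_1$ and $t = t_2$, that is, $S(q_j(t_1), t_1, q_j(t_2), t_2)$ to be

$$S(q_j(t_1), t_1, q_j(t_2), t_2) = \int_{t_1}^{t_2}\left[\mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t)\right]dt \tag{13.14}$$

The stationary path is obtained by using the variational principle

$$\delta S = \delta\int_{t_1}^{t_2}\left[\mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t)\right]dt = 0 \tag{13.15}$$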

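The integrand in the modified Hamilton’s principle, $I = \left[\mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t)\right]$, can be used in the $n$ Euler-Lagrange equations for $j = 1, 2, 3, \ldots, n$ to give

$$\frac{d}{dt}\left(\frac{\partial I}{\partial \dot{q}_j}\right) - \frac{\partial I}{\partial q_j} = \dot{p}_j + \frac{\partial H}{\partial q_j} = 0 \tag{13.16}$$

Similarly, the other $n$ Euler-Lagrange equations give

$$\frac{d}{dt}\left(\frac{\partial I}{\partial \dot{p}_j}\right) - \frac{\partial I}{\partial p_j} = -\dot{q}_j + \frac{\partial H}{\partial p_j} = 0 \tag{13.17}$$

Thus Hamilton’s principle of least-action leads to Hamilton’s equations of motion, that is, equations 13.16 and 13.17.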
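
Hamilton’s equations 13.16 and 13.17 can be integrated directly as a pair of first-order equations. The following is a minimal sketch for an assumed example, the one-dimensional harmonic oscillator with $H = p^2/2m + \frac{1}{2}kq^2$. The leapfrog stepping scheme and the function names are illustrative choices, not something prescribed by the text.

```python
import math


def hamiltonian(q, p, m=1.0, k=1.0):
    """H = p^2/(2m) + k q^2 / 2 for the assumed harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def integrate(q, p, t_end, dt=1e-3, m=1.0, k=1.0):
    """Leapfrog steps of q_dot = dH/dp = p/m and p_dot = -dH/dq = -k*q."""
    for _ in range(int(round(t_end / dt))):
        p -= 0.5 * dt * k * q      # half step in p using -dH/dq
        q += dt * p / m            # full step in q using dH/dp
        p -= 0.5 * dt * k * q      # second half step in p
    return q, p


q0, p0 = 1.0, 0.0
q_half, p_half = integrate(q0, p0, math.pi)   # half a period for m = k = 1
energy_drift = abs(hamiltonian(q_half, p_half) - hamiltonian(q0, p0))
```

After half a period the oscillator has moved from $q = 1$ to approximately $q = -1$ with $p$ close to zero, and the value of $H$ is conserved to within the accuracy of the stepping scheme.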
The total time derivative of the action $S$, which is a function of the coordinates and time, is

$$\frac{dS}{dt} = \frac{\partial S}{\partial t} + \sum_j\frac{\partial S}{\partial q_j}\dot{q}_j = \frac{\partial S}{\partial t} + \mathbf{p}\cdot\dot{\mathbf{q}} \tag{13.18}$$

But the total time derivative of equation 13.14 equals

$$\frac{dS}{dt} = \mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t) \tag{13.19}$$

Combining equations 13.18 and 13.19 gives the Hamilton-Jacobi equation which is discussed in chapter 14.5.

$$\frac{\partial S}{\partial t} + H(\mathbf{q}, \mathbf{p}, t) = 0 \tag{13.20}$$

In summary, Hamilton’s principle of least action led directly to Hamilton’s equations of motion (13.16, 13.17) plus the Hamilton-Jacobi equation (13.20). Note that both Hamilton’s Principle (13.8), and Hamilton’s equations of motion (13.16, 13.17), have been derived directly from Hamilton’s concept of Least Action $S$ without explicitly invoking the Lagrangian.
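
Equations 13.13 and 13.20 can be checked on a case where the action is known in closed form. The sketch below assumes a free particle of mass $m$, for which the action along the straight-line path from $(q_1, t_1)$ to $(q_2, t_2)$ is $S = m(q_2 - q_1)^2 / 2(t_2 - t_1)$ and $H = p^2/2m$. The example, the function names, and the finite-difference step are illustrative assumptions.

```python
def free_particle_action(q2, t2, q1=0.0, t1=0.0, m=1.0):
    """Action of a free particle moving uniformly from (q1, t1) to (q2, t2)."""
    return 0.5 * m * (q2 - q1) ** 2 / (t2 - t1)


def partial(f, x, h=1e-6):
    """Central finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def hamilton_jacobi_residual(q2, t2, m=1.0):
    """Return (dS/dt + H, p) with p = dS/dq evaluated at the final end point."""
    p = partial(lambda q: free_particle_action(q, t2, m=m), q2)
    ds_dt = partial(lambda t: free_particle_action(q2, t, m=m), t2)
    return ds_dt + p * p / (2.0 * m), p


residual, momentum = hamilton_jacobi_residual(q2=3.0, t2=2.0, m=1.5)
```

The momentum obtained from $\partial S/\partial q$ equals $m(q_2 - q_1)/(t_2 - t_1)$, the mechanical momentum of uniform motion, and the residual $\partial S/\partial t + H$ vanishes to within the finite-difference error.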


13.2.3  Abbreviated  action 


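Hamilton’s Principle determines completely the path of the motion and the position on the path as a function of time. If the Lagrangian and the Hamiltonian are time independent, that is, conservative, then $H = E$ and equation 13.14 equals

$$S(q_j(t_1), t_1, q_j(t_2), t_2) = \int_{t_1}^{t_2}\left[\mathbf{p}\cdot\dot{\mathbf{q}} - E\right]dt = \int_{\mathbf{q}(t_1)}^{\mathbf{q}(t_2)}\mathbf{p}\cdot\delta\mathbf{q} - E(t_2 - t_1) \tag{13.21}$$

The $\int\mathbf{p}\cdot\delta\mathbf{q}$ term in equation 13.21 is called the abbreviated action, which is defined as

$$S_0 \equiv \int_{t_1}^{t_2}\mathbf{p}\cdot\dot{\mathbf{q}}\,dt = \int_{\mathbf{q}(t_1)}^{\mathbf{q}(t_2)}\mathbf{p}\cdot\delta\mathbf{q} \tag{13.22}$$

The abbreviated action can be simplified by assuming that the standard Lagrangian $L = T - U$ has a velocity-independent potential $U$. Then equation 8.4 gives

$$S_0 \equiv \int_{t_1}^{t_2}\sum_j p_j\dot{q}_j\,dt = \int_{t_1}^{t_2}(L + H)\,dt = \int_{t_1}^{t_2}2T\,dt = \int_{\mathbf{q}(t_1)}^{\mathbf{q}(t_2)}\mathbf{p}\cdot\delta\mathbf{q} \tag{13.23}$$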
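
The equality of the time integral of $2T$ and the path integral of $p$ over $q$ in equation 13.23 can be checked numerically. This sketch assumes a one-dimensional harmonic oscillator followed around one full period, for which both integrals should equal $2\pi E/\omega$. The parameter values, function names, and midpoint-rule quadrature are illustrative assumptions.

```python
import math


def abbreviated_action_time(m=2.0, k=8.0, amp=0.5, n=20000):
    """Integrate 2T dt over one period of q(t) = amp*cos(omega*t)."""
    omega = math.sqrt(k / m)
    dt = (2.0 * math.pi / omega) / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        v = -amp * omega * math.sin(omega * t)
        total += m * v * v * dt            # 2T = m v^2
    return total


def abbreviated_action_path(m=2.0, k=8.0, amp=0.5, n=20000):
    """Integrate p dq around the closed orbit using p(q) from energy conservation."""
    energy = 0.5 * k * amp * amp
    dq = 2.0 * amp / n
    half_loop = 0.0
    for i in range(n):
        q = -amp + (i + 0.5) * dq
        p = math.sqrt(max(2.0 * m * (energy - 0.5 * k * q * q), 0.0))
        half_loop += p * dq
    return 2.0 * half_loop                 # out with p > 0, back with p < 0


s0_time = abbreviated_action_time()
s0_path = abbreviated_action_path()
```

With the assumed values $E = 1$ and $\omega = 2$, both integrals come out close to $\pi$. The path integral needs only $p(q)$, which illustrates that the abbreviated action depends on the path and the energy but not on the time dependence of the motion.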


Abbreviated action provides for use of a simplified form of the principle of least action that is based on the kinetic energy and not the potential energy. For conservative systems it determines the path of the motion, but not the time dependence of the motion. Consider virtual motions where the path satisfies energy conservation, and where the end points are held fixed, that is $\delta q_j = 0$, but allow for a variation $\delta t$ in the final time. Then using equation 13.12

$$\delta S = -H\,\delta t = -E\,\delta t \tag{13.24}$$

However, equation 13.21 gives that

$$\delta S = \delta S_0 - E\,\delta t \tag{13.25}$$

Therefore

$$\delta S_0 = 0 \tag{13.26}$$

That is, the abbreviated action has a minimum with respect to all paths that satisfy the conservation of energy, which can be written as

$$\delta S_0 = \delta\int_{t_1}^{t_2}2T\,dt = 0 \tag{13.27}$$

Equation 13.27 is called Maupertuis’ least-action principle, which he proposed in 1744 based on Fermat’s
Principle  in  optics.  Credit  for  the  formulation  of  least  action  commonly  is  given  to  Maupertuis;  however,  the 
Maupertuis  principle  is  identical  to  use  of  least  action  applied  to  the  "vis  viva",  as  was  proposed  by  Leibniz 
four  decades  earlier.  Maupertuis  used  teleological  arguments,  rather  than  scientific  rigor,  because  of  his 
limited  mathematical  capabilities.  In  1744  Euler  provided  a scientifically  rigorous  argument,  presented  above, 
that  underlies  the  Maupertuis  principle.  Euler  derived  the  correct  variational  relation  for  the  abbreviated 
action  to  be 


$$\delta S_0 = \delta \int_{q_1}^{q_2} \mathbf{p}\cdot\delta\mathbf{q} = 0 \tag{13.28}$$


Hamilton’s use of the principle of least action to derive both Lagrangian and Hamiltonian mechanics is a remarkable accomplishment. It underlies Hamiltonian mechanics and confirmed the conjecture of Maupertuis.
13.3 Standard Lagrangian

Lagrangian  mechanics,  as  introduced  in  chapters  5,  6,  was  based  on  the  concepts  of  kinetic  energy  and 
potential  energy.  The  derivation  of  Lagrangian  mechanics,  given  in  chapter  6,  was  based  on  d’Alembert’s 
principle  of  virtual  work  which  led  to  the  definition  of  the  standard  Lagrangian.  The  standard  Lagrangian 
was  defined  in  chapter  6.2  to  be  the  difference  between  the  kinetic  and  potential  energies. 


$$L(\mathbf{q}, \dot{\mathbf{q}}, t) = T(\dot{\mathbf{q}}, t) - U(\mathbf{q}, t) \tag{13.29}$$


Hamilton extended Lagrangian mechanics by defining Hamilton’s Principle, equation 13.2, which states that a dynamical system follows a path for which the action functional, that is, the time integral of the Lagrangian, is stationary. Chapter 6 showed that using the standard Lagrangian in the action functional leads to the Euler-Lagrange variational equations

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) \tag{13.30}$$


The Lagrange multiplier terms handle the holonomic constraint forces and $Q_j^{EXC}$ handles the remaining excluded generalized forces. Chapters 6-12 showed that the use of the standard Lagrangian, with the Euler-Lagrange equations (13.30), provides a remarkably powerful and flexible way to derive second-order equations of motion for dynamical systems in classical mechanics.

Note that the Euler-Lagrange equations, expressed solely in terms of the standard Lagrangian (13.29), that is, excluding the $Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$ terms, are valid only under the following conditions:

1.  The  forces  acting  on  the  system,  apart  from  any  forces  of  constraint,  must  be  derivable  from  scalar 
potentials. 

2.  The  equations  of  constraint  must  be  relations  that  connect  the  coordinates  of  the  particles  and  may 
be  functions  of  time,  that  is,  the  constraints  are  holonomic. 

The $Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$ terms extend the range of validity of using the standard Lagrangian in the Euler-Lagrange equations by introducing constraint forces and additional forces explicitly.

Chapters 6-12 exploited Lagrangian mechanics based on use of the standard definition of the Lagrangian. This chapter shows that the powerful Lagrangian formulation, using the standard Lagrangian, can be extended to include alternative non-standard Lagrangians that may be applied to dynamical systems where use of the standard definition is inapplicable. If these non-standard Lagrangians satisfy Hamilton’s Action Principle, 13.2, then they can be used with the Euler-Lagrange equations to generate the correct equations of motion, even though the Lagrangian may have no direct relation to the kinetic and potential energies as is the case for the standard Lagrangian. Currently, the development and exploitation of non-standard Lagrangians is an active field of Lagrangian mechanics.


13.4  Gauge  invariance  of  the  Lagrangian 

Note  that  the  standard  Lagrangian  is  not  unique  in  that  there  is  a continuous  spectrum  of  equivalent 
standard  Lagrangians  that  all  lead  to  identical  equations  of  motion.  This  is  because  the  Lagrangian  L is  a 
scalar  quantity  that  is  invariant  with  respect  to  coordinate  transformations.  The  following  transformations 
change  the  standard  Lagrangian,  but  leave  the  equations  of  motion  unchanged. 


1.  The  Lagrangian  is  indefinite  with  respect  to  addition  of  a constant  to  the  scalar  potential  which  cancels 
out  when  the  derivatives  in  the  Euler-Lagrange  differential  equations  are  applied. 

2.  The  Lagrangian  is  indefinite  with  respect  to  addition  of  a constant  kinetic  energy. 

3. The Lagrangian is indefinite with respect to addition of a total time derivative of the form $L_2 \rightarrow L_1 + \frac{d}{dt}\left[\Lambda(q_j, t)\right]$, for any differentiable function $\Lambda(q_j, t)$ of the generalized coordinates plus time, that has continuous second derivatives.
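This last statement can be proved by considering a transformation between two related standard Lagrangians of the form

$$L_2(\mathbf{q}, \dot{\mathbf{q}}, t) = L_1(\mathbf{q}, \dot{\mathbf{q}}, t) + \frac{d\Lambda(\mathbf{q}, t)}{dt} = L_1(\mathbf{q}, \dot{\mathbf{q}}, t) + \left(\sum_j \frac{\partial \Lambda(\mathbf{q}, t)}{\partial q_j}\dot{q}_j + \frac{\partial \Lambda(\mathbf{q}, t)}{\partial t}\right) \tag{13.31}$$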


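This leads to a standard Lagrangian $L_2$ that has the same equations of motion as $L_1$, as is shown by substituting equation 13.31 into the Euler-Lagrange equations. That is,

$$\frac{d}{dt}\left(\frac{\partial L_2}{\partial \dot{q}_j}\right) - \frac{\partial L_2}{\partial q_j} = \frac{d}{dt}\left(\frac{\partial L_1}{\partial \dot{q}_j}\right) - \frac{\partial L_1}{\partial q_j} + \frac{d}{dt}\frac{\partial \Lambda(\mathbf{q}, t)}{\partial q_j} - \frac{\partial}{\partial q_j}\frac{d\Lambda(\mathbf{q}, t)}{dt} = \frac{d}{dt}\left(\frac{\partial L_1}{\partial \dot{q}_j}\right) - \frac{\partial L_1}{\partial q_j} \tag{13.32}$$

Thus even though $L_1$ and $L_2$ are different, they are completely equivalent in that they generate identical equations of motion.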
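The cancellation of the gauge terms can also be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it assumes a one-dimensional harmonic-oscillator Lagrangian $L_1$, a hypothetical gauge function $\Lambda(q,t) = a q^2 t + \sin q$, and an arbitrary smooth trial path. It evaluates the Euler-Lagrange expression $\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q}$ by central finite differences for both $L_1$ and $L_2 = L_1 + d\Lambda/dt$.

```python
import math

M, K, A = 1.0, 4.0, 0.3   # hypothetical mass, spring constant, gauge strength

def lagrangian_1(q, v, t):
    """Standard Lagrangian L1 = T - U for a harmonic oscillator."""
    return 0.5 * M * v * v - 0.5 * K * q * q

def lagrangian_2(q, v, t):
    """L2 = L1 + dLambda/dt with Lambda(q, t) = A q^2 t + sin(q)."""
    dlambda_dq = 2.0 * A * q * t + math.cos(q)
    dlambda_dt = A * q * q
    return lagrangian_1(q, v, t) + dlambda_dq * v + dlambda_dt

def path(t):
    """Arbitrary smooth trial path q(t) with its analytic velocity."""
    q = math.sin(1.3 * t) + 0.2 * t * t
    v = 1.3 * math.cos(1.3 * t) + 0.4 * t
    return q, v

def euler_lagrange(lagrangian, t, h=1e-5, dt=1e-4):
    """d/dt(dL/dv) - dL/dq along the trial path, by central differences."""
    def dl_dv(time):
        q, v = path(time)
        return (lagrangian(q, v + h, time) - lagrangian(q, v - h, time)) / (2 * h)
    q, v = path(t)
    dl_dq = (lagrangian(q + h, v, t) - lagrangian(q - h, v, t)) / (2 * h)
    ddt_dl_dv = (dl_dv(t + dt) - dl_dv(t - dt)) / (2 * dt)
    return ddt_dl_dv - dl_dq

if __name__ == "__main__":
    for t in (0.2, 0.9, 1.7):
        print(t, euler_lagrange(lagrangian_1, t), euler_lagrange(lagrangian_2, t))
```

The two columns agree to within the finite-difference error at every time, even though the trial path is not a solution of the equations of motion. This reflects the fact that the gauge term drops out of the Euler-Lagrange expression identically, not only along the physical trajectory.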

There  is  an  unlimited  range  of  equivalent  standard  Lagrangians  that  all  lead  to  the  same  equations  of 
motion  and  satisfy  the  requirements  of  the  Lagrangian.  That  is,  there  is  no  unique  choice  among  the  wide 
range  of  equivalent  standard  Lagrangians  expressed  in  terms  of  generalized  coordinates.  This  discussion  is 
an  example  of  gauge  invariance  in  physics. 

Modern  theories  in  physics  describe  reality  in  terms  of  potential  fields.  Gauge  invariance,  which  also  is 
called  gauge  symmetry,  is  a property  of  field  theory  for  which  different  underlying  fields  lead  to  identical 
observable  quantities.  Well-known  examples  are  the  static  electric  potential  field  and  the  gravitational 
potential  field  where  any  arbitrary  constant  can  be  added  to  these  scalar  potentials  with  zero  impact  on  the 
observed  static  electric  field  or  the  observed  gravitational  field.  Gauge  theories  constrain  the  laws  of  physics 
in  that  the  impact  of  gauge  transformations  must  cancel  out  when  expressed  in  terms  of  the  observables. 
Gauge  symmetry  plays  a crucial  role  in  both  classical  and  quantal  manifestations  of  field  theory,  e.g.  it  is 
the  basis  of  the  Standard  Model  of  electroweak  and  strong  interactions. 

Equivalent Lagrangians are a clear manifestation of gauge invariance as illustrated by equations 13.31 and 13.32, which show that adding any total time derivative of a scalar function $\Lambda(\mathbf{q}, t)$ to the Lagrangian has no observable consequences on the equations of motion. That is, although addition of the total time derivative of the scalar function $\Lambda(\mathbf{q}, t)$ changes the value of the Lagrangian, it does not change the equations of motion for the observables derived using equivalent standard Lagrangians.

In  Lagrangian  formulations  of  classical  mechanics,  the  gauge  invariance  is  readily  apparent  by  direct 
inspection  of  the  Lagrangian. 


13.1  Example:  Gauge  invariance  in  electromagnetism 

The scalar electric potential $\Phi$ and the vector potential $\mathbf{A}$ fields in electromagnetism are examples of gauge-invariant fields. These electromagnetic-potential fields are not directly observable, that is, the electromagnetic observable quantities are the electric field $\mathbf{E}$ and magnetic field $\mathbf{B}$ which can be derived from the scalar and vector potential fields $\Phi$ and $\mathbf{A}$. An advantage of using the potential fields is that they reduce the problem from 6 components, 3 each for $\mathbf{E}$ and $\mathbf{B}$, to 4 components, one for the scalar field $\Phi$ and 3 for the vector potential $\mathbf{A}$. The Lagrangian for the velocity-dependent Lorentz force, given by equation 6.67, provides an example of gauge invariance. Equations 6.63 and 6.65 showed that the electric and magnetic fields can be expressed in terms of scalar and vector potentials $\Phi$ and $\mathbf{A}$ by the relations


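$$\mathbf{B} = \nabla \times \mathbf{A}$$

$$\mathbf{E} = -\nabla\Phi - \frac{\partial \mathbf{A}}{\partial t}$$

The equations of motion for a charge $q$ in an electromagnetic field can be obtained by using the Lagrangian

$$L = \frac{1}{2}m\mathbf{v}\cdot\mathbf{v} - q(\Phi - \mathbf{A}\cdot\mathbf{v})$$

Consider the transformations $(\mathbf{A}, \Phi) \rightarrow (\mathbf{A}', \Phi')$ in the transformed Lagrangian $L'$, where

$$\mathbf{A}' = \mathbf{A} + \nabla\Lambda(\mathbf{r}, t)$$

$$\Phi' = \Phi - \frac{\partial \Lambda(\mathbf{r}, t)}{\partial t}$$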
The transformed Lorentz-force Lagrangian $L'$ is related to the original Lorentz-force Lagrangian $L$ by

$$L' = L + q\left[\dot{\mathbf{r}}\cdot\nabla\Lambda(\mathbf{r}, t) + \frac{\partial \Lambda(\mathbf{r}, t)}{\partial t}\right] = L + q\frac{d}{dt}\Lambda(\mathbf{r}, t)$$


Note that the additive term $q\frac{d}{dt}\Lambda(\mathbf{r}, t)$ is an exact time differential. Thus the Lagrangian is gauge invariant, implying that identical equations of motion are obtained using either of these equivalent Lagrangians.

The force fields $\mathbf{E}$ and $\mathbf{B}$ can be used to show that the above transformation is gauge-invariant. That is,

$$\mathbf{E}' = -\nabla\Phi' - \frac{\partial \mathbf{A}'}{\partial t} = -\nabla\Phi - \frac{\partial \mathbf{A}}{\partial t} = \mathbf{E}$$

$$\mathbf{B}' = \nabla\times\mathbf{A}' = \nabla\times\mathbf{A} = \mathbf{B}$$

That is, the additive terms due to the scalar field $\Lambda(\mathbf{r}, t)$ cancel. Thus the electromagnetic force fields following a gauge-invariant transformation are shown to be identical, in agreement with what is inferred directly by inspection of the Lagrangian.
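The invariance of $\mathbf{E}$ and $\mathbf{B}$ can be illustrated numerically. The following sketch uses arbitrary, hypothetical choices for $\Phi$, $\mathbf{A}$ and the gauge function $\Lambda$ (none of them come from the text) and evaluates the fields before and after the gauge transformation using central finite differences.

```python
import math

H = 1e-4  # finite-difference step

def phi(x, y, z, t):
    """Hypothetical scalar potential."""
    return x * y + math.sin(z) * t

def vec_a(x, y, z, t):
    """Hypothetical vector potential (Ax, Ay, Az)."""
    return (y * z * t, x * x, math.cos(x) + t * z)

def gauge(x, y, z, t):
    """Hypothetical gauge function Lambda(r, t)."""
    return math.sin(x * y) + z * t * t

def partial(f, args, i):
    """Central-difference derivative of f with respect to argument i."""
    lo, hi = list(args), list(args)
    lo[i] -= H
    hi[i] += H
    return (f(*hi) - f(*lo)) / (2 * H)

def phi_prime(x, y, z, t):
    """Phi' = Phi - dLambda/dt."""
    return phi(x, y, z, t) - partial(gauge, (x, y, z, t), 3)

def vec_a_prime(x, y, z, t):
    """A' = A + grad Lambda."""
    a = vec_a(x, y, z, t)
    return tuple(a[i] + partial(gauge, (x, y, z, t), i) for i in range(3))

def component(vector, i):
    """Return a scalar function giving component i of a vector field."""
    return lambda *p: vector(*p)[i]

def fields(scalar, vector, point):
    """E = -grad(Phi) - dA/dt and B = curl(A) at point = (x, y, z, t)."""
    e = tuple(-partial(scalar, point, i) - partial(component(vector, i), point, 3)
              for i in range(3))
    b = (partial(component(vector, 2), point, 1) - partial(component(vector, 1), point, 2),
         partial(component(vector, 0), point, 2) - partial(component(vector, 2), point, 0),
         partial(component(vector, 1), point, 0) - partial(component(vector, 0), point, 1))
    return e, b

if __name__ == "__main__":
    point = (0.3, -0.7, 1.1, 0.5)
    print(fields(phi, vec_a, point))
    print(fields(phi_prime, vec_a_prime, point))
```

The potentials themselves change under the transformation, while the two printed field values agree to within rounding error.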


13.5  Non-standard  Lagrangians 

The definition of the standard Lagrangian was based on d’Alembert’s differential variational principle. The flexibility and power of Lagrangian mechanics can be extended to a broader range of dynamical systems by employing an extended definition of the Lagrangian that is based on Hamilton’s Principle, equation 13.1. Hamilton’s Principle was introduced 46 years after the standard formulation of Lagrangian mechanics. Hamilton’s Principle provides a general definition of the Lagrangian that applies to standard Lagrangians, which are expressed as the difference between the kinetic and potential energies, as well as to non-standard Lagrangians where there may be no clear separation into kinetic and potential energy terms. These non-standard Lagrangians can be used with the Euler-Lagrange equations to generate the correct equations of motion even though they may have no relation to the kinetic and potential energies. The extended definition of the Lagrangian based on Hamilton’s action functional 13.1 can be exploited for developing non-standard definitions of the Lagrangian that may be applied to dynamical systems where use of the standard definition is inapplicable. Firstly, non-standard Lagrangians can be as useful as the standard Lagrangian for deriving equations of motion for a system. Secondly, non-standard Lagrangians that have no energy interpretation are available for deriving the equations of motion for many nonconservative systems. Thirdly, Lagrangians are useful irrespective of how they were derived. For example, they can be used to derive conservation laws or the equations of motion. Coordinate transformation of the Lagrangian is much simpler than that required when using the equations of motion. The relativistic Lagrangian defined in chapter 16.6 is a well-known example of a non-standard Lagrangian.


13.6  Inverse  variational  calculus 

Non-standard Lagrangians and Hamiltonians are not based on the concept of kinetic and potential energies. Therefore, development of non-standard Lagrangians and Hamiltonians requires an alternative approach that ensures that they satisfy Hamilton’s Principle, equation 13.2, which underlies the Lagrangian and Hamiltonian formulations. One useful alternative approach is to derive the Lagrangian or Hamiltonian via an inverse variational process based on the assumption that the equations of motion are known. Helmholtz developed the field of inverse variational calculus, which plays an important role in development of non-standard Lagrangians. An example of this approach is use of the well-known Lorentz force as the basis for deriving a corresponding Lagrangian to handle systems involving electromagnetic forces. Inverse variational calculus is a branch of mathematics that is beyond the scope of this textbook. The Douglas theorem [Dou41] states that, if the three Helmholtz conditions are satisfied, then there exists a Lagrangian that, when used with the Euler-Lagrange differential equations, leads to the given set of equations of motion. Thus, it will be assumed that the inverse variational calculus technique can be used to derive a Lagrangian from known equations of motion.
13.7 Dissipative Lagrangians


Energy dissipation is an irreversible process that plays an important role for most physical systems encountered in nature. This irreversibility contrasts with the reversible nature of the basic models employed to describe conservative systems. Dissipation for an observed system usually arises from interactions between the observed system and a bath of unobserved systems that absorb the energy. Usually the detailed structure of the many systems that absorb the dissipated energy is irrelevant for the calculation of the dissipation. However, calculation of the interactions, and the transition from reversibility to irreversibility, are challenging problems to solve. In the Newtonian formulation the dissipation can be handled via a phenomenological approach. Unfortunately, incorporating dissipative processes into the Lagrangian and Hamiltonian variational framework is more difficult. This difficulty stems from the fact that these variational formulations were derived from d’Alembert’s principle which assumes that the virtual work done by the constraint forces is zero, which is not true for dissipative forces.

As  discussed  in  chapter  8.7,  the  following  three  approaches  can  be  used  to  introduce  dissipative  forces 
into  Lagrangian  mechanics. 


1. The dissipative force can be introduced as an external generalized force $Q_j^{EXC}$.

2. For the special case of linear dissipation, it is possible to use the Rayleigh dissipation function

$$\mathcal{R}(\dot{\mathbf{q}}) \equiv \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} b_{ij}\dot{q}_i\dot{q}_j \tag{13.33}$$

as discussed in chapter 8.7.2. Note that

$$2\mathcal{R} = \frac{dW_f}{dt} \tag{13.34}$$

which is the rate of energy loss due to the dissipative forces involved.

3. Extensions of Lagrangian mechanics using non-standard Lagrangians can be used that build dissipation directly into the Lagrangian. This can allow exploitation of Lagrangian mechanics for a wide range of dissipative systems.
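The relation between the Rayleigh dissipation function and the rate of energy loss in the second approach can be made concrete with a small numerical sketch. It assumes the usual relation $Q_i = -\partial\mathcal{R}/\partial\dot{q}_i$ for the generalized dissipative force; the coefficient matrix and velocities below are hypothetical values chosen only for illustration.

```python
def rayleigh_function(b, qdot):
    """R = (1/2) sum_ij b_ij qdot_i qdot_j for a coefficient matrix b."""
    n = len(qdot)
    return 0.5 * sum(b[i][j] * qdot[i] * qdot[j]
                     for i in range(n) for j in range(n))

def dissipative_forces(b, qdot):
    """Generalized forces Q_i = -dR/d(qdot_i), evaluated analytically."""
    n = len(qdot)
    return [-0.5 * sum((b[i][j] + b[j][i]) * qdot[j] for j in range(n))
            for i in range(n)]

def dissipated_power(b, qdot):
    """Rate of energy loss, -sum_i Q_i qdot_i, which should equal 2R."""
    forces = dissipative_forces(b, qdot)
    return -sum(f * v for f, v in zip(forces, qdot))

if __name__ == "__main__":
    b = [[2.0, 0.5], [0.5, 1.0]]   # hypothetical damping coefficients
    qdot = [1.0, -2.0]             # hypothetical generalized velocities
    print(rayleigh_function(b, qdot), dissipated_power(b, qdot))
```

For these values the Rayleigh function is 2.0 and the dissipated power is 4.0, consistent with the rate of energy loss being twice the Rayleigh function.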


The use of non-standard Lagrangians is based on the inverse variational problem where known second-order equations of motion, plus the inverse variational approach, are used to derive a Lagrangian or Hamiltonian that generates the assumed equations of motion. Non-standard Lagrangians can have very different functional dependences on $q$, $\dot{q}$, and $t$ compared with standard Lagrangians, and yet still can lead to the required equations of motion, the generalized momenta, and the corresponding Hamiltonian, needed to solve problems in classical mechanics. The reason for exploring the capabilities of non-standard Lagrangians is that they have the potential to eliminate some of the limitations endemic to Lagrangian and Hamiltonian mechanics.

Dissipation plays a prominent role in the burgeoning field of non-linear dynamical systems in classical mechanics. This prominence has stimulated recent studies of the applicability of standard, and non-standard, Lagrangians to a wide range of dissipative dynamical systems. Musielak et al, and others, [Mus08a, Mus08b, Cei10] considered dynamical systems that were described by equations of motion with first-order time-derivative dissipative terms of even and odd powers, and coefficients varying in time or space. They found that there are at least three different classes of equations of motion, two of which use standard Lagrangians and can be classified as general. However, the third class is special in that it can be derived only using non-standard Lagrangians. Each general class has a subset of equations with non-standard Lagrangians. The existence of standard Lagrangians is limited to equations of motion with either time-dependent coefficients plus linear dissipative terms, or space-dependent coefficients and quadratic dissipative terms. However, the equations of motion that can be derived from non-standard Lagrangians are restricted by conditions that must be satisfied by the coefficients and functions of these equations. Although these non-standard Lagrangians may have restricted applicability, they do provide hope that such techniques can be used to broaden the scope of problems that can be addressed using the basic Lagrangian and Hamiltonian mechanics formalisms. Note that, even though Lagrange published his treatise on analytical mechanics in 1788, fundamental problems remain to be solved in order to attain the full potential capabilities of analytical mechanics.
13.8 Linear velocity-dependent dissipation

As discussed in chapter 15.8, dissipative forces for fluids and gases depend linearly on velocity at low velocities, that is, for Reynolds numbers $Re < 1$. Such linear-velocity dissipative forces occur frequently in nature. The wide range of electrical conductors that obey Ohm’s Law provides an example of a dissipative force that depends linearly on velocity. By contrast, dissipative forces in fluids and gases at high velocities, that is, at Reynolds numbers $10^3 < Re < 10^5$, depend quadratically on velocity. Thus Lagrangians, or Hamiltonians, are needed that can account for dissipation that may have a non-linear dependence on velocity.

The special case of linear velocity-dependent dissipation is used below to illustrate the potential capabilities of standard and non-standard Lagrangians to derive the equations of motion for dissipative dynamical systems.

Linear  velocity-dependent  dissipation  was  considered  by  Bauer  [Bau31]  who  stated  two  theorems  that 
show  that  the  equations  of  motion,  for  dissipative  dynamical  systems  having  a linear  dependence  on  velocity 
with  constant  coefficients,  cannot  be  derived  from  a variational  principle.  These  theorems  are: 

1. The equations of motion of a conservative linear dynamical system are given by a variational principle only if the masses of the system are constant.

2.  The  equations  of  motion  of  a dissipative  linear  dynamical  system  are  given  by  a variational  principle  if, 
and  only  if,  the  dissipation  coefficients  are  identically  equal  to  the  rates  of  change  of  the  corresponding 
masses. 

Bateman [Bat31] pointed out that an isolated dissipative system is physically incomplete, that is, a complete system must comprise at least two coupled subsystems where energy is transferred from a dissipating subsystem to an absorbing subsystem. A complete system should comprise both the dissipating and absorbing systems to ensure that the total system Lagrangian and Hamiltonian are conserved, as is assumed in conventional Lagrangian and Hamiltonian mechanics. Both Bateman and Dekker [Dek75] have illustrated that the equations of motion for a linearly-damped, free, one-dimensional harmonic oscillator are derivable using the Hamilton variational principle via introduction of a fictitious complementary subsystem that absorbs the energy, and is a function of a second variable that mirrors the function of the variable for the dissipative subsystem of interest.

Example 13.2 illustrates that the linearly-damped, linear oscillator may be handled by three alternative equivalent non-standard Lagrangians that assume either: (1) a multidimensional system, (2) explicit time-dependent Lagrangians and Hamiltonians, or (3) complex non-standard Lagrangians, to generate the equations of motion.

13.2 Example: The linearly-damped, linear oscillator

Three  toy  dynamical  models  have  been  used  to  describe  the  linearly-damped,  linear  oscillator  employing 
very  different  non-standard  Lagrangians  to  generate  the  required  Hamiltonians,  and  to  derive  the  correct 
equations  of  motion. 

1: Dual-component Lagrangian: $L_{Dual}$

Bateman proposed a dual system comprising a mass $m$ subject to two coupled one-dimensional variables $(x, y)$ where $x$ is the observed variable and $y$ is the mirror variable for the subsystem that absorbs the energy dissipated by the subsystem $x$.

Assume a non-standard Lagrangian of the form

$$L_{Dual} = \frac{m}{2}\left\{\dot{x}\dot{y} - \frac{\Gamma}{2}\left[y\dot{x} - x\dot{y}\right] - \omega_0^2 xy\right\} \tag{a}$$

where $\Gamma = \frac{b}{m}$ is the damping coefficient. Minimizing by variation of the auxiliary variable $y$, that is, $\Lambda_y L = 0$, leads to the uncoupled equation of motion for $x$

$$\frac{m}{2}\left[\ddot{x} + \Gamma\dot{x} + \omega_0^2 x\right] = 0 \tag{b}$$
Similarly, minimizing by variation of the primary variable $x$, that is, $\Lambda_x L = 0$, leads to the uncoupled equation of motion for $y$

$$\frac{m}{2}\left[\ddot{y} - \Gamma\dot{y} + \omega_0^2 y\right] = 0 \tag{c}$$

Note that equation of motion (b), which was obtained by variation of the auxiliary variable $y$, corresponds to that for the usual free, linearly-damped, one-dimensional harmonic oscillator for the $x$ variable which dissipates energy as is discussed in chapter 3.5. The equation of motion (c) is obtained by variation of the primary variable $x$ and corresponds to a free, linear, one-dimensional oscillator for the $y$ variable that is absorbing the energy dissipated by the dissipating $x$ system.

The generalized momenta,

$$p_i = \frac{\partial L}{\partial \dot{q}_i}$$

can be used to derive the corresponding Hamiltonian

$$H_{Dual}(x, p_x, y, p_y) = \left[p_x\dot{x} + p_y\dot{y} - L\right] = \frac{2p_xp_y}{m} - \frac{\Gamma}{2}\left[xp_x - yp_y\right] + \frac{m}{2}\left[\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2\right]xy \tag{d}$$

Note that this Hamiltonian is time independent, and thus is conserved for this complete dual-variable system. Using Hamilton’s equations of motion gives the same two uncoupled equations of motion as obtained using the Lagrangian, i.e. (b) and (c).
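The behaviour of the dual system can be explored numerically. The sketch below is a hedged illustration, not the author's method: it assumes the dual Hamiltonian $H = \frac{2p_xp_y}{m} - \frac{\Gamma}{2}[xp_x - yp_y] + \frac{m}{2}[\omega_0^2 - (\frac{\Gamma}{2})^2]xy$, which follows from a dual Lagrangian with overall factor $\frac{m}{2}$, uses hypothetical parameter values, and integrates Hamilton's equations with a hand-written fourth-order Runge-Kutta step.

```python
import math

M, GAMMA, OMEGA0 = 1.0, 0.4, 2.0            # hypothetical parameters
OMEGA_SQ = OMEGA0 ** 2 - (GAMMA / 2) ** 2   # omega^2 = omega_0^2 - (Gamma/2)^2

def hamiltonian(state):
    x, y, px, py = state
    return (2 * px * py / M - 0.5 * GAMMA * (x * px - y * py)
            + 0.5 * M * OMEGA_SQ * x * y)

def derivatives(state):
    """Hamilton's equations for the assumed dual Hamiltonian."""
    x, y, px, py = state
    return (2 * py / M - 0.5 * GAMMA * x,                  # dx/dt  =  dH/dpx
            2 * px / M + 0.5 * GAMMA * y,                  # dy/dt  =  dH/dpy
            0.5 * GAMMA * px - 0.5 * M * OMEGA_SQ * y,     # dpx/dt = -dH/dx
            -0.5 * GAMMA * py - 0.5 * M * OMEGA_SQ * x)    # dpy/dt = -dH/dy

def rk4_step(state, dt):
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))
    k1 = derivatives(state)
    k2 = derivatives(shifted(k1, dt / 2))
    k3 = derivatives(shifted(k2, dt / 2))
    k4 = derivatives(shifted(k3, dt))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def initial_state():
    # x(0) = y(0) = 1 with zero initial velocities; momenta from p = dL/d(qdot)
    return (1.0, 1.0, -0.25 * M * GAMMA, 0.25 * M * GAMMA)

def integrate(t_end=5.0, dt=1e-3):
    state = initial_state()
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt)
    return state

if __name__ == "__main__":
    final = integrate()
    print(final[0], final[1], hamiltonian(final) - hamiltonian(initial_state()))
```

In this sketch the $x$ amplitude decays as $e^{-\Gamma t/2}$ while the mirror variable $y$ grows as $e^{+\Gamma t/2}$, and the value of the Hamiltonian stays constant to within the integration error.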

2: Time-dependent Lagrangian: $L_{Damped}$

The complementary subsystem of the above dual-component Lagrangian, that is added to the primary dissipative subsystem, is the adjoint to the equations for the primary subsystem of interest. In some cases, a set of the solutions of the complementary equations can be expressed in terms of the solutions of the primary subsystem, allowing the equations of motion to be expressed solely in terms of the variables of the primary subsystem. Inspection of the solutions of the damped harmonic oscillator, presented in chapter 3.5, implies that $x$ and $y$ must be related by the function

$$y = xe^{\Gamma t} \tag{e}$$

Therefore Bateman proposed a time-dependent, non-standard Lagrangian $L_{Damped}$ of the form

$$L_{Damped} = \frac{m}{2}e^{\Gamma t}\left[\dot{x}^2 - \omega_0^2 x^2\right] \tag{f}$$

This Lagrangian $L_{Damped}$ corresponds to a harmonic oscillator for which the mass $m = m_0 e^{\Gamma t}$ is accreting exponentially with time in order to mimic the exponential energy dissipation. Use of this Lagrangian in the Euler-Lagrange equations gives the equation of motion

$$me^{\Gamma t}\left[\ddot{x} + \Gamma\dot{x} + \omega_0^2 x\right] = 0 \tag{g}$$


If the factor outside of the bracket is non-zero, then the expression in the bracket must be zero. The expression in the bracket is the required equation of motion for the linearly-damped linear oscillator. This Lagrangian generates a generalized momentum of

$$p_x = me^{\Gamma t}\dot{x}$$

and the Hamiltonian is

$$H_{Damped} = p_x\dot{x} - L_{Damped} = \frac{p_x^2}{2m}e^{-\Gamma t} + \frac{m}{2}\omega_0^2 e^{\Gamma t}x^2 \tag{h}$$

The Hamiltonian is time dependent as expected. This leads to Hamilton’s equations of motion

$$\dot{x} = \frac{\partial H_{Damped}}{\partial p_x} = \frac{p_x}{m}e^{-\Gamma t} \tag{i}$$

$$-\dot{p}_x = \frac{\partial H_{Damped}}{\partial x} = m\omega_0^2 e^{\Gamma t}x \tag{j}$$

Taking the total time derivative of equation (i) and using equation (j) to substitute for $\dot{p}_x$ gives

$$me^{\Gamma t}\left[\ddot{x} + \Gamma\dot{x} + \omega_0^2 x\right] = 0 \tag{k}$$
If the term $me^{\Gamma t}$ is non-zero, then the term in brackets is zero. The term in the bracket is the usual equation of motion for the linearly-damped harmonic oscillator.
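As a numerical cross-check of the time-dependent formulation, the sketch below integrates Hamilton's equations $\dot{x} = \frac{p_x}{m}e^{-\Gamma t}$ and $\dot{p}_x = -m\omega_0^2 e^{\Gamma t}x$ with a fourth-order Runge-Kutta scheme and compares the result with the textbook underdamped solution. The parameter values, function names and initial conditions are hypothetical choices made only for this illustration.

```python
import math

M, GAMMA, OMEGA0 = 1.0, 0.4, 2.0   # hypothetical parameters

def derivatives(t, x, p):
    """Hamilton's equations for the time-dependent damped Hamiltonian."""
    dx = p * math.exp(-GAMMA * t) / M                  # dx/dt =  dH/dp
    dp = -M * OMEGA0 ** 2 * math.exp(GAMMA * t) * x    # dp/dt = -dH/dx
    return dx, dp

def integrate(t_end=5.0, dt=1e-3, x0=1.0, v0=0.0):
    """Fourth-order Runge-Kutta integration starting from x0, v0 at t = 0."""
    x, p = x0, M * v0          # p = m exp(Gamma t) v reduces to m v at t = 0
    for i in range(round(t_end / dt)):
        t = i * dt
        k1 = derivatives(t, x, p)
        k2 = derivatives(t + dt / 2, x + dt / 2 * k1[0], p + dt / 2 * k1[1])
        k3 = derivatives(t + dt / 2, x + dt / 2 * k2[0], p + dt / 2 * k2[1])
        k4 = derivatives(t + dt, x + dt * k3[0], p + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        p += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, p

def damped_solution(t, x0=1.0):
    """Underdamped solution of x'' + Gamma x' + omega_0^2 x = 0 with x'(0) = 0."""
    w = math.sqrt(OMEGA0 ** 2 - (GAMMA / 2) ** 2)
    return x0 * math.exp(-GAMMA * t / 2) * (math.cos(w * t) + GAMMA / (2 * w) * math.sin(w * t))

if __name__ == "__main__":
    x_final, p_final = integrate()
    print(x_final, damped_solution(5.0))
```

The canonical momentum $p_x$ grows with the factor $e^{\Gamma t}$ even though the physical velocity decays, which is a reminder that the canonical momentum of a non-standard Lagrangian need not equal the mechanical momentum $m\dot{x}$.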

3: Complex Lagrangian: $L_{Complex}$

Dekker proposed use of complex dynamical variables for solving the linearly-damped harmonic oscillator. It exploits the fact that, in principle, each second-order differential equation can be expressed in terms of a set of first-order differential equations. This feature is the essential difference between Lagrangian and Hamiltonian mechanics. Let $q$ be complex and assume it can be expressed in the form of a real variable $x$ as

$$q = \dot{x} - \left(i\omega - \frac{\Gamma}{2}\right)x \tag{l}$$

Substituting this complex variable into the relation

$$\dot{q} + \left[i\omega + \frac{\Gamma}{2}\right]q = 0 \tag{m}$$

leads to the second-order equation for the real variable $x$ of

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0 \tag{n}$$


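This is the desired equation of motion for the linearly-damped harmonic oscillator. This result also can be shown by taking the time derivative of equation (m) and taking only the real part, i.e.

$$\ddot{q} + i\omega\dot{q} + \frac{\Gamma}{2}\dot{q} = \ddot{q} + \Gamma\dot{q} + \left(\frac{\Gamma}{2} - i\omega\right)\left(\frac{\Gamma}{2} + i\omega\right)q = \ddot{q} + \Gamma\dot{q} + \omega_0^2 q = 0 \tag{o}$$

where equation (m) was used to replace $\dot{q}$ in the term $-\left(\frac{\Gamma}{2} - i\omega\right)\dot{q}$, and $\omega^2 + \left(\frac{\Gamma}{2}\right)^2 = \omega_0^2$.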


This feature is exploited using the following Lagrangian

$$L_{Complex} = \frac{i}{2}\left(q^*\dot{q} - q\dot{q}^*\right) - \left(\omega - i\frac{\Gamma}{2}\right)qq^* \tag{p}$$

where $\omega^2 = \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2$. The Lagrangian $L_{Complex}$ is real for a conservative system and complex for a dissipative system. Using the Euler-Lagrange equation for variation of $q^*$, that is, $\Lambda_{q^*}L_{Complex} = 0$, gives equation (m) which leads to the required equation of motion (n).
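The link between the first-order complex equation and the second-order damped-oscillator equation can be verified with a few lines of complex arithmetic. The sketch below assumes the first-order relation $\dot{q} + (i\omega + \frac{\Gamma}{2})q = 0$ with $\omega^2 = \omega_0^2 - (\frac{\Gamma}{2})^2$, uses hypothetical parameter values, and checks that its exponential solution also satisfies $\ddot{q} + \Gamma\dot{q} + \omega_0^2 q = 0$.

```python
import cmath
import math

GAMMA, OMEGA0 = 0.4, 2.0                          # hypothetical parameters
OMEGA = math.sqrt(OMEGA0 ** 2 - (GAMMA / 2) ** 2)
C = complex(GAMMA / 2, OMEGA)                     # c = Gamma/2 + i omega

def q(t, q0=1.0 + 0.0j):
    """Solution of the first-order equation dq/dt + c q = 0."""
    return q0 * cmath.exp(-C * t)

def second_order_residual(t):
    """q'' + Gamma q' + omega_0^2 q using the analytic derivatives of q(t)."""
    value = q(t)
    q_dot = -C * value          # from the first-order equation
    q_ddot = C * C * value      # its time derivative
    return q_ddot + GAMMA * q_dot + OMEGA0 ** 2 * value

if __name__ == "__main__":
    for t in (0.0, 0.7, 3.0):
        print(t, q(t).real, abs(second_order_residual(t)))
```

The residual vanishes to rounding error, and the real part of $q(t)$ is the familiar decaying oscillation $e^{-\Gamma t/2}\cos\omega t$ for the initial condition chosen here.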

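The canonical conjugate momenta are given by

$$p = \frac{\partial L_{Complex}}{\partial \dot{q}} = \frac{i}{2}q^* \qquad \tilde{p} = \frac{\partial L_{Complex}}{\partial \dot{q}^*} = -\frac{i}{2}q \tag{q}$$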


The above Lagrangian plus canonically conjugate momenta lead to the Hamiltonian

$$H_{Complex}(p, q, \tilde{p}, q^*) = p\dot{q} + \tilde{p}\dot{q}^* - L_{Complex} = \left(\frac{\Gamma}{2} + i\omega\right)(\tilde{p}q^* - pq) \tag{r}$$

Hamilton’s equations of motion for this Hamiltonian lead to the equations of motion for $q$ and $q^*$.

The above examples have shown that three very different, non-standard Lagrangians, plus their corresponding Hamiltonians, all lead to the correct equation of motion for the linearly-damped harmonic oscillator. This illustrates the power of using non-standard Lagrangians to describe dissipative motion in classical mechanics. However, postulating non-standard Lagrangians to produce the required equations of motion appears to be of questionable usefulness. A fundamental approach is needed to build a firm foundation upon which non-standard Lagrangian mechanics can be based. Non-standard Lagrangian mechanics remains an active, albeit narrow, frontier of classical mechanics.
13.9 Summary

This  chapter  introduced  Hamilton’s  use  of  least  action  to  derive  Hamilton’s  Principle,  and  its  application  to 
Lagrangian  and  Hamiltonian  mechanics.  Gauge  invariance  of  the  Lagrangian  was  discussed.  The  concept  of 
alternative  standard,  and  non-standard,  Lagrangians  was  introduced  and  their  applicability  was  illustrated. 
The  following  summarizes  the  conclusions. 


Hamilton’s Principle: Hamilton’s Principle is based on use of variational calculus to determine the equations of motion for which the action functional $S$ has a stationary solution, where

$$S = \int_{t_1}^{t_2} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt \tag{13.1}$$

That is

$$\delta S = \delta\int_{t_1}^{t_2} L\,dt = 0 \tag{13.2}$$


Hamilton’s  Principle  of  least  action  leads  directly  to  the  Lagrange-Euler  equations  without  assuming  that 
the  Lagrangian  is  of  the  standard  form.  That  is,  Hamilton’s  Principle  allows  for  a wide  range  of  allowable 
functional  forms  for  the  Lagrangian. 

Hamilton’s  Principle  leads  to  a direct  relation  between  the  generalized  momentum  and  the  action. 


$$p_j = \frac{\partial S}{\partial q_j} \tag{13.13}$$


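It was shown that Hamilton’s Principle of least action predicts Hamilton’s equations of motion

$$\dot{q}_j - \frac{\partial H}{\partial p_j} = 0 \qquad \dot{p}_j + \frac{\partial H}{\partial q_j} = 0$$

In addition, it predicts the Hamilton-Jacobi equation

$$\frac{\partial S}{\partial t} + H(\mathbf{q}, \mathbf{p}, t) = 0 \tag{13.20}$$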


Gauge  invariance  of  the  standard  Lagrangian:  It  was  shown  that  there  is  a continuum  of  equivalent 
standard  Lagrangians  that  lead  to  the  same  set  of  equations  of  motion  for  a system.  This  feature  is  related 
to  gauge  invariance  in  mechanics.  The  following  transformations  change  the  standard  Lagrangian,  but  leave 
the  equations  of  motion  unchanged. 

1.  The  Lagrangian  is  indefinite  with  respect  to  addition  of  a constant  to  the  scalar  potential  which  cancels 
out  when  the  derivatives  in  the  Euler-Lagrange  differential  equations  are  applied. 

2.  Similarly  the  Lagrangian  is  indefinite  with  respect  to  addition  of  a constant  kinetic  energy. 

3. The Lagrangian is indefinite with respect to addition of a total time derivative of the form $L \rightarrow L + \frac{d}{dt}\left[\Lambda(q_j, t)\right]$ for any differentiable function $\Lambda(q_j, t)$ of the generalized coordinates, plus time, that has continuous second derivatives.


Non-standard  Lagrangians:  The  flexibility  and  power  of  Lagrangian  mechanics  can  be  extended  to  a 
broader  range  of  dynamical  systems  by  employing  an  extended  definition  of  the  Lagrangian  that  is  allowed 
by  Hamilton’s  variational  action  principle,  equation  13.2.  It  was  illustrated  that  the  inverse  variational 
calculus  formalism  can  be  used  to  identify  non-standard  Lagrangians  that  generate  the  required  equations 
of  motion.  These  non-standard  Lagrangians  can  be  very  different  from  the  standard  Lagrangian  and  do  not 
separate  into  kinetic  and  potential  energy  components.  These  alternative  Lagrangians  can  be  used  to  handle 
dissipative  systems  which  are  beyond  the  range  of  validity  when  using  standard  Lagrangians.  That  is,  it 
was  shown  that  several  very  different  Lagrangians  and  Hamiltonians  can  be  equivalent  for  generating  useful 
equations  of  motion  of  a system.  Currently  the  use  of  non-standard  Lagrangians  is  a narrow,  but  active, 
frontier  of  classical  mechanics. 


Chapter  14 


Advanced  Hamiltonian  mechanics 


14.1  Introduction 

This  study  of  classical  mechanics  has  involved  climbing  a vast  mountain  of  knowledge,  while  the  pathway 
to  the  top  has  led  us  to  elegant  and  beautiful  theories  that  underlie  much  of  modern  physics.  Being  so 
close to the summit provides the opportunity to take a few extra steps in order to glimpse applications of
variational techniques to physics at the summit. These are described next in chapters 14 to 17.

Hamilton’s development of Hamiltonian mechanics in 1834 is the crowning achievement for applying
variational principles to classical mechanics. A fundamental advantage of Hamiltonian mechanics is that it uses
the conjugate coordinates q, p, plus time t, which is a considerable advantage in most branches of physics
and engineering. Compared to Lagrangian mechanics, Hamiltonian mechanics has a significantly broader
arsenal of powerful techniques that can be exploited to obtain an analytical solution of the integrals of the
motion for complicated systems. In addition, Hamiltonian dynamics provides a means of determining the
unknown variables for which the solution assumes a soluble form, and is ideal for study of the fundamental
underlying physics in applications to fields such as quantum or statistical physics. As a consequence,
Hamiltonian mechanics is the preeminent variational approach used in modern physics. This chapter introduces
the following four techniques in Hamiltonian mechanics: (1) the elegant Poisson bracket representation of
Hamiltonian mechanics, which played a pivotal role in the development of quantum theory; (2) the
powerful Hamilton-Jacobi theory coupled with Jacobi’s development of canonical transformation theory; (3)
action-angle variable theory; and (4) canonical perturbation theory.

Prior to further development of the theory of Hamiltonian mechanics, it is useful to summarize the major
formulae relevant to Hamiltonian mechanics that have been presented in chapters 7, 8, and 13.

Action  functional  S: 

As  discussed  in  chapter  13.2,  Hamiltonian  mechanics  is  built  upon  Hamilton’s  action  functional 

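$$S(\mathbf{q},\mathbf{p},t) = \int_{t_1}^{t_2} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt \tag{14.1}$$

Hamilton’s Principle of least action states that

$$\delta S(\mathbf{q},\mathbf{p},t) = \delta \int_{t_1}^{t_2} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt = 0 \tag{14.2}$$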


Generalized momentum p:

In chapter 7.2, the generalized (canonical) momentum was defined in terms of the Lagrangian L to be

$$p_i = \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial \dot{q}_i} \tag{14.3}$$

Chapter 13.2 defined the generalized momentum in terms of the action functional S to be

$$p_j = \frac{\partial S(\mathbf{q},\mathbf{p},t)}{\partial q_j} \tag{14.4}$$

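Generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$:

Jacobi’s Generalized Energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ was defined in equation 7.37 as

$$h(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j \dot{q}_j \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial \dot{q}_j} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{14.5}$$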


Hamiltonian  function: 

The Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ was defined in terms of the generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ plus the generalized
momentum. That is

$$H(\mathbf{q},\mathbf{p},t) = h(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j p_j \dot{q}_j - L(\mathbf{q},\dot{\mathbf{q}},t) = \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{14.6}$$

where $\mathbf{p}$, $\mathbf{q}$ correspond to n-dimensional vectors, e.g. $\mathbf{q} = (q_1, q_2, \ldots, q_n)$, and the scalar product $\mathbf{p}\cdot\dot{\mathbf{q}} = \sum_i p_i \dot{q}_i$.
Chapter 8.2 used a Legendre transformation to derive this relation between the Hamiltonian and Lagrangian
functions. Note that whereas the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$ is expressed in terms of the coordinates $\mathbf{q}$, plus
conjugate velocities $\dot{\mathbf{q}}$, the Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus their
conjugate momenta $\mathbf{p}$. For scleronomic systems, plus assuming the standard Lagrangian, equations
7.44 and 7.29 give that the Hamiltonian simplifies to equal the total mechanical energy, that is, $H = T + U$.

Generalized  energy  theorem: 

The equations of motion lead to the generalized energy theorem, which states that the time dependence
of the Hamiltonian is related to the time dependence of the Lagrangian.

$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j \dot{q}_j \left[ Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t) \right] - \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{14.7}$$

Note that if all the generalized non-potential forces $Q_j^{EXC}$ and Lagrange multiplier terms are zero, and if the
Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion.

Hamilton’s  equations  of  motion: 

Chapter 8.3 showed that a Legendre transform plus the Lagrange-Euler equations led to Hamilton’s
equations of motion. Hamilton derived these equations of motion directly from the action functional, as
shown in chapter 13.2.

$$\dot{q}_j = \frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial p_j} \tag{14.8}$$

$$\dot{p}_j = -\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial q_j} + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \tag{14.9}$$

$$\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial t} = -\frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{14.10}$$

Note the symmetry of Hamilton’s two canonical equations. The canonical variables $p_k, q_k$ are treated
as independent canonical variables. Lagrange was the first to derive the canonical equations but he did not
recognize them as a basic set of equations of motion. Hamilton derived the canonical equations of motion
from his fundamental variational principle and made them the basis for a far-reaching theory of dynamics.
Hamilton’s equations give 2s first-order differential equations for $p_k, q_k$ for the s degrees of freedom.
Lagrange’s equations give s second-order differential equations for the variables $q_k, \dot{q}_k$.

Hamilton-Jacobi equation:

Hamilton used Hamilton’s Principle to derive the Hamilton-Jacobi equation.

$$\frac{\partial S}{\partial t} + H(\mathbf{q},\mathbf{p},t) = 0 \tag{14.11}$$

The solution of Hamilton’s equations is trivial if the Hamiltonian is a constant of motion, or when a set of
generalized coordinates can be identified for which all the coordinates $q_i$ are constant, or are cyclic (also called
ignorable coordinates). Jacobi developed the mathematical framework of canonical transformation required
to exploit the Hamilton-Jacobi equation.

14.2  Poisson  bracket  representation  of  Hamiltonian  mechanics

14.2.1  Poisson  Brackets 


Poisson  brackets  were  developed  by  Poisson,  who  was  a student  of  Lagrange.  Hamilton’s  canonical  equations 
of  motion  describe  the  time  evolution  of  the  canonical  variables  (q,p)  in  phase  space.  Jacobi  showed  that  the 
framework  of  Hamiltonian  mechanics  can  be  restated  in  terms  of  the  elegant  and  powerful  Poisson  bracket 
formalism.  The  Poisson  bracket  representation  of  Hamiltonian  mechanics  provides  a direct  link  between 
classical  mechanics  and  quantum  mechanics. 

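The Poisson bracket of any two continuous functions of the canonical variables, $F(p,q)$ and $G(p,q)$, is
defined to be

$$[F,G]_{qp} \equiv \sum_i \left( \frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i} \right) \tag{14.12}$$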


Note that the above definition of the Poisson bracket leads to the following identity, antisymmetry, linearity,
Leibniz rule, and Jacobi identity.

$$[F,F] = 0 \tag{14.13}$$

$$[F,G] = -[G,F] \tag{14.14}$$

$$[G,F+Y] = [G,F] + [G,Y] \tag{14.15}$$

$$[G,FY] = [G,F]\,Y + F\,[G,Y] \tag{14.16}$$

$$0 = [F,[G,Y]] + [G,[Y,F]] + [Y,[F,G]] \tag{14.17}$$

where F, G, and Y are functions of the canonical variables plus time. Jacobi’s identity (14.17) states that
the sum of the cyclic permutations of the double Poisson brackets of three functions is zero. Jacobi’s identity
plays a useful role in Hamiltonian mechanics as will be shown.
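
The algebraic properties 14.14, 14.16 and 14.17 can be checked numerically. The following is a minimal sketch, not part of the original derivation: it estimates the partial derivatives in equation 14.12 by central differences, and the helper names `gradient`, `poisson_bracket` and `bracket_fn`, the sample functions F, G, Y, and the evaluation point are all assumptions chosen only for illustration.

```python
def gradient(f, z, h=1e-4):
    """Central-difference gradient of f at the phase-space point z."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def poisson_bracket(F, G, z, h=1e-4):
    """Estimate [F, G] at z = [q_1..q_n, p_1..p_n], following equation 14.12."""
    n = len(z) // 2
    dF, dG = gradient(F, z, h), gradient(G, z, h)
    return sum(dF[i] * dG[n + i] - dF[n + i] * dG[i] for i in range(n))


def bracket_fn(A, B):
    """Return [A, B] as a function of z so that brackets can be nested."""
    return lambda z: poisson_bracket(A, B, z)


# Sample functions of (q1, q2, p1, p2); hypothetical, for illustration only.
def F(z):
    q1, q2, p1, p2 = z
    return q1 * q1 * p2


def G(z):
    q1, q2, p1, p2 = z
    return q1 * p1 + q2 * p2 * p2


def Y(z):
    q1, q2, p1, p2 = z
    return p1 * q2 + q1 * q1


z0 = [0.3, -0.7, 1.1, 0.5]  # arbitrary phase-space point

# Antisymmetry (14.14): [F, G] + [G, F] should vanish.
antisymmetry = poisson_bracket(F, G, z0) + poisson_bracket(G, F, z0)

# Leibniz rule (14.16): [G, FY] - [G, F] Y - F [G, Y] should vanish.
leibniz = (poisson_bracket(G, lambda z: F(z) * Y(z), z0)
           - poisson_bracket(G, F, z0) * Y(z0)
           - F(z0) * poisson_bracket(G, Y, z0))

# Jacobi identity (14.17): the cyclic sum of double brackets should vanish.
jacobi = (poisson_bracket(F, bracket_fn(G, Y), z0)
          + poisson_bracket(G, bracket_fn(Y, F), z0)
          + poisson_bracket(Y, bracket_fn(F, G), z0))

print(antisymmetry, leibniz, jacobi)
```

All three residuals should vanish to within the finite-difference and round-off error of the estimate. The nested brackets in the Jacobi sum are the least accurate because they differentiate an already approximate quantity.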


14.2.2  Fundamental  Poisson  brackets: 


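The Poisson brackets of the canonical variables themselves are called the fundamental Poisson brackets.
They are

$$[q_k,q_l]_{qp} = \sum_i \left( \frac{\partial q_k}{\partial q_i}\frac{\partial q_l}{\partial p_i} - \frac{\partial q_k}{\partial p_i}\frac{\partial q_l}{\partial q_i} \right) = \sum_i \left( \delta_{ik}\cdot 0 - 0\cdot\delta_{il} \right) = 0 \tag{14.18}$$

$$[p_k,p_l]_{qp} = \sum_i \left( \frac{\partial p_k}{\partial q_i}\frac{\partial p_l}{\partial p_i} - \frac{\partial p_k}{\partial p_i}\frac{\partial p_l}{\partial q_i} \right) = \sum_i \left( 0\cdot\delta_{il} - \delta_{ik}\cdot 0 \right) = 0 \tag{14.19}$$

$$[q_k,p_l]_{qp} = \sum_i \left( \frac{\partial q_k}{\partial q_i}\frac{\partial p_l}{\partial p_i} - \frac{\partial q_k}{\partial p_i}\frac{\partial p_l}{\partial q_i} \right) = \sum_i \left( \delta_{ik}\delta_{il} - 0\cdot 0 \right) = \delta_{kl} \tag{14.20}$$

In summary, the fundamental Poisson brackets equal

$$[q_k,q_l]_{qp} = 0 \tag{14.21}$$

$$[p_k,p_l]_{qp} = 0 \tag{14.22}$$

$$[q_k,p_l]_{qp} = -[p_l,q_k]_{qp} = \delta_{kl} \tag{14.23}$$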


Note that the Poisson bracket is antisymmetric under interchange of p and q. It is interesting that the only
non-zero fundamental Poisson bracket is for conjugate variables where k = l, that is

$$[q_k,p_k]_{qp} = 1 \tag{14.24}$$
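
As an illustration of equations 14.21 to 14.23, the short sketch below evaluates every fundamental bracket for a system with two degrees of freedom. It is only a numerical check under stated assumptions: the finite-difference helpers `gradient` and `poisson_bracket`, the selector functions `q(k)` and `p(k)`, and the evaluation point are hypothetical names introduced here, not part of the text.

```python
def gradient(f, z, h=1e-4):
    """Central-difference gradient of f at the phase-space point z."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def poisson_bracket(F, G, z, h=1e-4):
    """Estimate [F, G] at z = [q_1..q_n, p_1..p_n], following equation 14.12."""
    n = len(z) // 2
    dF, dG = gradient(F, z, h), gradient(G, z, h)
    return sum(dF[i] * dG[n + i] - dF[n + i] * dG[i] for i in range(n))


n = 2                        # two degrees of freedom (illustrative choice)
z0 = [0.4, -1.2, 0.9, 2.0]   # arbitrary point [q_1, q_2, p_1, p_2]


def q(k):
    """Function that picks out the coordinate q_k from the state z."""
    return lambda z: z[k]


def p(k):
    """Function that picks out the momentum p_k from the state z."""
    return lambda z: z[n + k]


fundamental = {}
for k in range(n):
    for l in range(n):
        fundamental[("q", k, "q", l)] = poisson_bracket(q(k), q(l), z0)
        fundamental[("p", k, "p", l)] = poisson_bracket(p(k), p(l), z0)
        fundamental[("q", k, "p", l)] = poisson_bracket(q(k), p(l), z0)

for key, value in sorted(fundamental.items()):
    print(key, value)
```

Only the entries pairing a coordinate with its own conjugate momentum should come out close to one; every other entry should be close to zero, independent of the point chosen.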
14.2.3  Poisson  bracket  invariance  to  canonical  transformations


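The Poisson brackets are invariant under a canonical transformation from one set of canonical variables
$(q_k, p_k)$ to a new set of canonical variables $(Q_k, P_k)$ where $Q_k \rightarrow Q_k(\mathbf{q},\mathbf{p})$ and $P_k \rightarrow P_k(\mathbf{q},\mathbf{p})$. This is shown
by transforming the definition, equation 14.12, to the new variables by the following derivation

$$[F,G]_{qp} = \sum_j \left( \frac{\partial F}{\partial q_j}\frac{\partial G}{\partial p_j} - \frac{\partial F}{\partial p_j}\frac{\partial G}{\partial q_j} \right) \tag{14.25}$$

$$= \sum_{jk} \left[ \frac{\partial F}{\partial q_j}\left( \frac{\partial G}{\partial Q_k}\frac{\partial Q_k}{\partial p_j} + \frac{\partial G}{\partial P_k}\frac{\partial P_k}{\partial p_j} \right) - \frac{\partial F}{\partial p_j}\left( \frac{\partial G}{\partial Q_k}\frac{\partial Q_k}{\partial q_j} + \frac{\partial G}{\partial P_k}\frac{\partial P_k}{\partial q_j} \right) \right] \tag{14.26}$$

The terms can be rearranged to give

$$[F,G]_{qp} = \sum_k \left( \frac{\partial G}{\partial Q_k}[F,Q_k]_{qp} + \frac{\partial G}{\partial P_k}[F,P_k]_{qp} \right) \tag{14.27}$$

Let $F = Q_k$ and replace G by F, and use the fact that the fundamental Poisson brackets are $[Q_k,Q_j]_{qp} = 0$
and $[Q_k,P_j]_{qp} = \delta_{jk}$. Then equation 14.27 reduces to

$$[Q_k,F]_{qp} = \sum_j \left( \frac{\partial F}{\partial Q_j}[Q_k,Q_j]_{qp} + \frac{\partial F}{\partial P_j}[Q_k,P_j]_{qp} \right) = \sum_j \frac{\partial F}{\partial P_j}\delta_{jk} \tag{14.28}$$

That is

$$[F,Q_k]_{qp} = -\frac{\partial F}{\partial P_k} \tag{14.29}$$

Similarly

$$[P_k,F]_{qp} = \sum_j \left( \frac{\partial F}{\partial Q_j}[P_k,Q_j]_{qp} + \frac{\partial F}{\partial P_j}[P_k,P_j]_{qp} \right) = -\sum_j \frac{\partial F}{\partial Q_j}\delta_{jk} \tag{14.30}$$

leading to

$$[F,P_k]_{qp} = \frac{\partial F}{\partial Q_k} \tag{14.31}$$

Substituting equations (14.29) and (14.31) into equation (14.27) gives

$$[F,G]_{qp} = \sum_k \left( \frac{\partial F}{\partial Q_k}\frac{\partial G}{\partial P_k} - \frac{\partial F}{\partial P_k}\frac{\partial G}{\partial Q_k} \right) = [F,G]_{QP} \tag{14.32}$$

Thus the canonical variable subscripts (q,p) and (Q,P) can be ignored since the Poisson bracket is
invariant to any canonical transformation of canonical variables. The converse is that if the Poisson
bracket is independent of the transformation then the transformation is canonical.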


14.1  Example:  Check  that  a transformation  is  canonical 

The independence of Poisson brackets to canonical transformations can be used to test if a transformation
is canonical. Assume that the transformation equations between two sets of coordinates are given by

$$Q = \ln\left(1 + q^{\frac12}\cos p\right) \qquad P = 2\left(1 + q^{\frac12}\cos p\right) q^{\frac12}\sin p$$

Evaluating the Poisson brackets gives [Q, Q] = 0, [P, P] = 0 while

$$[Q,P] = \frac{\partial Q}{\partial q}\frac{\partial P}{\partial p} - \frac{\partial P}{\partial q}\frac{\partial Q}{\partial p}$$

$$= \frac{q^{-\frac12}\cos p}{1 + q^{\frac12}\cos p}\left[ -q\sin^2 p + \left(1 + q^{\frac12}\cos p\right) q^{\frac12}\cos p \right] + \frac{q^{\frac12}\sin^2 p}{1 + q^{\frac12}\cos p}\left[ \cos p + \left(1 + q^{\frac12}\cos p\right) q^{-\frac12} \right] = 1$$

Therefore if q, p are canonical with a Poisson bracket [q, p] = 1, then so are Q, P since [Q, P] = 1 = [q, p].
Since it has been shown that this transformation is canonical, it is possible to go further and determine
the function that generates this transformation. Solving the transformation equations for q and P gives

$$q = \left(e^{Q} - 1\right)^2 \sec^2 p \qquad P = 2e^{Q}\left(e^{Q} - 1\right)\tan p$$

Since the transformation is canonical, there exists a generating function $F_3(Q,p)$ such that

$$q = -\frac{\partial F_3}{\partial p} \qquad P = -\frac{\partial F_3}{\partial Q}$$

The transformation function $F_3(Q,p)$ can be obtained using

$$dF_3(Q,p) = \frac{\partial F_3}{\partial Q}dQ + \frac{\partial F_3}{\partial p}dp = -P\,dQ - q\,dp$$

$$= -d\left[\left(e^{Q} - 1\right)^2\right]\tan p - \left(e^{Q} - 1\right)^2 d\tan p = -d\left[\left(e^{Q} - 1\right)^2 \tan p\right]$$

This then gives that the required generating function is

$$F_3(Q,p) = -\left(e^{Q} - 1\right)^2 \tan p$$

This example illustrates how to determine a useful generating function and prove that the transformation is
canonical.
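
The bracket evaluated in this example can also be checked numerically. The sketch below is an illustration only: it estimates [Q, P] by central differences at one arbitrarily chosen point, and confirms that the inverse relations for q and P quoted above recover the starting values. The helper names and the evaluation point are assumptions made here.

```python
import math


def gradient(f, z, h=1e-5):
    """Central-difference gradient of f at the phase-space point z."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def poisson_bracket(F, G, z, h=1e-5):
    """Estimate [F, G] at z = [q_1..q_n, p_1..p_n], following equation 14.12."""
    n = len(z) // 2
    dF, dG = gradient(F, z, h), gradient(G, z, h)
    return sum(dF[i] * dG[n + i] - dF[n + i] * dG[i] for i in range(n))


def Q(z):
    """New coordinate Q = ln(1 + sqrt(q) cos p)."""
    q, p = z
    return math.log(1.0 + math.sqrt(q) * math.cos(p))


def P(z):
    """New momentum P = 2 (1 + sqrt(q) cos p) sqrt(q) sin p."""
    q, p = z
    root = math.sqrt(q)
    return 2.0 * (1.0 + root * math.cos(p)) * root * math.sin(p)


z0 = [0.8, 0.4]  # arbitrary point (q, p) with q > 0

bracket_QP = poisson_bracket(Q, P, z0)

# Inverse relations quoted in the example, evaluated at the same point.
Q0, P0 = Q(z0), P(z0)
q_back = (math.exp(Q0) - 1.0) ** 2 / math.cos(z0[1]) ** 2
P_back = 2.0 * math.exp(Q0) * (math.exp(Q0) - 1.0) * math.tan(z0[1])

print(bracket_QP, q_back, P_back)
```

A single point does not prove that the transformation is canonical, but repeating the check at several points is a cheap way to catch algebra slips before attempting the analytic evaluation.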


14.2.4  Correspondence  of  the  commutator  and  the  Poisson  Bracket 

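In classical mechanics there is a formal correspondence between the Poisson bracket and the commutator.
This can be shown by deriving the Poisson Bracket of four functions taken in two pairs. The derivation
requires deriving the two possible Poisson Brackets involving three functions.

$$[F_1F_2,G] = \sum_j \left[ \left( \frac{\partial F_1}{\partial q_j}F_2 + F_1\frac{\partial F_2}{\partial q_j} \right)\frac{\partial G}{\partial p_j} - \left( \frac{\partial F_1}{\partial p_j}F_2 + F_1\frac{\partial F_2}{\partial p_j} \right)\frac{\partial G}{\partial q_j} \right] = [F_1,G]\,F_2 + F_1\,[F_2,G] \tag{14.33}$$

$$[F,G_1G_2] = [F,G_1]\,G_2 + G_1\,[F,G_2] \tag{14.34}$$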


These two Poisson Brackets for three functions can be used to derive the Poisson Bracket of four functions,
taken in pairs. This can be accomplished two ways using either equation 14.33 or 14.34. The order of the
factors in each product is retained throughout.

$$[F_1F_2,G_1G_2] = [F_1,G_1G_2]\,F_2 + F_1\,[F_2,G_1G_2]$$

$$= \{[F_1,G_1]\,G_2 + G_1\,[F_1,G_2]\}\,F_2 + F_1\,\{[F_2,G_1]\,G_2 + G_1\,[F_2,G_2]\}$$

$$= [F_1,G_1]\,G_2F_2 + G_1\,[F_1,G_2]\,F_2 + F_1\,[F_2,G_1]\,G_2 + F_1G_1\,[F_2,G_2] \tag{14.35}$$

The alternative approach gives

$$[F_1F_2,G_1G_2] = [F_1F_2,G_1]\,G_2 + G_1\,[F_1F_2,G_2]$$

$$= [F_1,G_1]\,F_2G_2 + F_1\,[F_2,G_1]\,G_2 + G_1\,[F_1,G_2]\,F_2 + G_1F_1\,[F_2,G_2] \tag{14.36}$$

These two alternate derivations give different relations for the same Poisson Bracket. Equating the alternative
equations 14.35 and 14.36 gives that

$$[F_1,G_1]\left( F_2G_2 - G_2F_2 \right) = \left( F_1G_1 - G_1F_1 \right)[F_2,G_2]$$

This can be factored into separate relations, the left-hand side for body 1, and the right-hand side for body 2:

$$\frac{\left( F_1G_1 - G_1F_1 \right)}{[F_1,G_1]} = \frac{\left( F_2G_2 - G_2F_2 \right)}{[F_2,G_2]} \tag{14.37}$$

Since the left-hand ratio holds for $F_1, G_1$ independent of $F_2, G_2$, and vice versa, both ratios must equal
a constant $\lambda$ that does not depend on $F_1, G_1$, does not depend on $F_2, G_2$, and $\lambda$ must commute with
$(F_1G_1 - G_1F_1)$. That is, $\lambda$ must be a constant number independent of these variables.

$$\left( F_1G_1 - G_1F_1 \right) = \lambda\,[F_1,G_1] = \lambda \sum_i \left( \frac{\partial F_1}{\partial q_i}\frac{\partial G_1}{\partial p_i} - \frac{\partial F_1}{\partial p_i}\frac{\partial G_1}{\partial q_i} \right) \tag{14.38}$$

Equation 14.38 is an especially important result which states that to within a multiplicative constant number
$\lambda$, there is a one-to-one correspondence between the Poisson Bracket and the commutator of two independent
functions. An important implication is that if two functions, $F_i$ and $G_k$, have a Poisson Bracket that is zero, then
the commutator of the two functions also must be zero, that is, $F_i$ and $G_k$ commute.

Consider the special case where the variables $F_1$ and $G_1$ correspond to the fundamental canonical
variables, $(q_k, p_l)$. Then the commutators of the fundamental canonical variables are given by

$$q_kp_l - p_lq_k = \lambda\,[q_k,p_l] = \lambda\,\delta_{kl} \tag{14.39}$$

$$q_kq_l - q_lq_k = \lambda\,[q_k,q_l] = 0 \tag{14.40}$$

$$p_kp_l - p_lp_k = \lambda\,[p_k,p_l] = 0 \tag{14.41}$$

In 1925, Paul Dirac, a 23-year-old graduate student at Cambridge, recognized that the formal correspondence
between the Poisson bracket in classical mechanics, and the corresponding commutator, provides a logical
and consistent way to bridge the chasm between the Hamiltonian formulation of classical mechanics, and
quantum mechanics. He realized that making the assumption that the constant $\lambda = i\hbar$ leads to Heisenberg’s
fundamental commutation relations in quantum mechanics, as is discussed in chapter 17.3.2. Assuming that
$\lambda = i\hbar$ provides a logical and consistent way that builds quantization directly into classical mechanics, rather
than using the ad-hoc, case-dependent hypotheses that were used by the older quantum theory of Bohr.


14.2.5  Observables  in  Hamiltonian  mechanics 

Poisson  brackets,  and  the  corresponding  commutation  relations,  are  especially  useful  for  elucidating  which 
observables  are  constants  of  motion,  and  whether  any  two  observables  can  be  measured  simultaneously  and 
exactly.  The  properties  of  any  observable  are  determined  by  the  following  two  criteria. 


Time  dependence: 

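The total time differential of a function $G(q_i,p_i,t)$ is defined by

$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + \sum_i \left( \frac{\partial G}{\partial q_i}\dot{q}_i + \frac{\partial G}{\partial p_i}\dot{p}_i \right) \tag{14.42}$$

Hamilton’s canonical equations give that

$$\dot{q}_i = \frac{\partial H}{\partial p_i} \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i} \tag{14.43}$$

Substituting these in the above relation gives

$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + \sum_i \left( \frac{\partial G}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial G}{\partial p_i}\frac{\partial H}{\partial q_i} \right) \tag{14.44}$$

that is

$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + [G,H] \tag{14.45}$$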


This  important  equation  states  that  the  total  time  derivative  of  any  function  G(q,p,t)  can  be  expressed  in 
terms  of  the  partial  time  derivative  plus  the  Poisson  bracket  of  G(q,p,t)  with  the  Hamiltonian. 
Any observable $G(p,q,t)$ will be a constant of motion if $\frac{dG}{dt} = 0$, and thus equation (14.45) gives

$$\frac{\partial G}{\partial t} + [G,H] = 0 \qquad \text{(if G is a constant of motion)}$$

That is, it is a constant of motion when

$$\frac{\partial G}{\partial t} = [H,G] \tag{14.46}$$

Moreover, this can be extended further to the statement that if the constant of motion G is not explicitly
time dependent then

$$[G,H] = 0 \tag{14.47}$$

The Poisson bracket with the Hamiltonian is zero for a constant of motion G that is not explicitly time
dependent. Often it is more useful to turn this statement around with the statement that if $[G,H] = 0$, and
$\frac{\partial G}{\partial t} = 0$, then $\frac{dG}{dt} = 0$, implying that G is a constant of motion.
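
Equation 14.45 can be illustrated with a one-dimensional harmonic oscillator, for which the trajectory is known in closed form. The sketch below is an assumption-laden illustration rather than part of the text: the observable G = qp, the parameter values, and the helper names are chosen here for convenience. It compares the time derivative of G along the exact trajectory with the bracket [G, H] at the same phase-space point, and also evaluates [H, H].

```python
import math


def gradient(f, z, h=1e-4):
    """Central-difference gradient of f at the phase-space point z."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def poisson_bracket(F, G, z, h=1e-4):
    """Estimate [F, G] at z = [q_1..q_n, p_1..p_n], following equation 14.12."""
    n = len(z) // 2
    dF, dG = gradient(F, z, h), gradient(G, z, h)
    return sum(dF[i] * dG[n + i] - dF[n + i] * dG[i] for i in range(n))


m, k = 2.0, 3.0                 # illustrative mass and spring constant
omega = math.sqrt(k / m)


def H(z):
    """Harmonic-oscillator Hamiltonian H = p^2/2m + k q^2/2."""
    q, p = z
    return p * p / (2.0 * m) + 0.5 * k * q * q


def G(z):
    """Observable with no explicit time dependence (illustrative choice)."""
    q, p = z
    return q * p


def state(t, q0=0.5, p0=-0.3):
    """Exact oscillator trajectory [q(t), p(t)] for initial values q0, p0."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return [q0 * c + p0 / (m * omega) * s, p0 * c - m * omega * q0 * s]


t, dt = 0.7, 1e-5

# Left-hand side of (14.45): dG/dt along the trajectory, by central difference.
dG_dt = (G(state(t + dt)) - G(state(t - dt))) / (2.0 * dt)

# Right-hand side of (14.45): [G, H], since G has no explicit time dependence.
bracket_GH = poisson_bracket(G, H, state(t))

# The Hamiltonian itself: [H, H] = 0, so H is a constant of motion.
bracket_HH = poisson_bracket(H, H, state(t))

print(dG_dt, bracket_GH, bracket_HH)
```

Here G = qp is not conserved, so the two sides of equation 14.45 agree on a nonzero rate, while the bracket of H with itself vanishes as required for a time-independent Hamiltonian.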


Independence 

Consider  two  observables  F(p,q,t ) and  G(p,q,t).  The  independence  of  these  two  observables  is  determined 
by  the  Poisson  bracket 

[F,G]  = -[G,F]  (14.48) 

If this Poisson bracket is zero, that is, if the two observables $F(p,q,t)$ and $G(p,q,t)$ commute, then their
values are independent and can be measured independently. However, if the Poisson bracket $[F,G] \neq 0$, that
is, $F(p,q,t)$ and $G(p,q,t)$ do not commute, then F and G are correlated since interchanging the order of
the Poisson bracket changes the sign, which implies that the measured value for F depends on whether G is
simultaneously measured.

A useful property of Poisson brackets is that if F and G both are constants of motion, then the double
Poisson bracket $[H,[F,G]] = 0$. This can be proved using Jacobi’s identity

$$[F,[G,H]] + [G,[H,F]] + [H,[F,G]] = 0 \tag{14.49}$$

If $[G,H] = 0$ and $[F,H] = 0$, then $[H,[F,G]] = 0$, that is, the Poisson bracket $[F,G]$ commutes with H. Note
that if F and G do not depend explicitly on time, that is $\frac{\partial F}{\partial t} = \frac{\partial G}{\partial t} = 0$, then combining equations (14.45)
and (14.49) leads to Poisson’s Theorem that relates the total time derivatives.

$$\frac{d}{dt}[F,G] = \left[ \frac{dF}{dt}, G \right] + \left[ F, \frac{dG}{dt} \right] \tag{14.50}$$

This implies that if F and G are invariants, that is $\frac{dF}{dt} = \frac{dG}{dt} = 0$, then the Poisson bracket $[F,G]$ is an
invariant if F and G are not explicitly time dependent.


14.2  Example:  Angular  momentum: 

Angular momentum, L, provides an example of the use of Poisson brackets to elucidate which observables
can be determined simultaneously. Consider that the Hamiltonian is time independent with a spherically
symmetric potential U(r). Then it is best to treat such a spherically symmetric potential using spherical
coordinates since the potential is independent of both $\theta$ and $\phi$.

The Poisson Brackets in classical mechanics can be used to tell us if two observables will commute. Since
U(r) is time independent, the Hamiltonian in spherical coordinates is

$$H = T + U = \frac{1}{2m}\left( p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2\sin^2\theta} \right) + U(r)$$

Evaluating the Poisson bracket using the above Hamiltonian gives

$$[p_\phi, H] = 0$$
Since $p_\phi$ is not an explicit function of time, $\frac{\partial p_\phi}{\partial t} = 0$, then $\frac{dp_\phi}{dt} = 0$, that is, the angular momentum about
the z axis $L_z = p_\phi$ is a constant of motion.

The total angular momentum $L^2$ commutes with the Hamiltonian, that is

$$[L^2, H] = \left[ p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}, H \right] = 0$$

Since the total angular momentum $L^2 = p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}$ is not explicitly time dependent, then it also must be a
constant of motion. Note that Noether’s theorem also gives that both the angular momenta $L^2$ and $L_z$ are
constants of motion. Also since the Poisson brackets are

$$[L_z, H] = 0 \qquad [L^2, H] = 0$$

then Jacobi’s identity, equation 14.17, can be used to imply that

$$[H, [L^2, L_z]] = 0$$

That is, the Poisson bracket $[L^2, L_z]$ is a constant of motion. Note that if $L^2$ and $L_z$ commute, that is,
$[L^2, L_z] = 0$, then they can be measured simultaneously with unlimited accuracy, and this also satisfies that
$[L^2, L_z]$ commutes with H.

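The (x, y, z) components of the angular momentum L are given by

$$L_x = \sum_{i=1}^{n} (\mathbf{r}\times\mathbf{p})_x = \sum_{i=1}^{n} \left( y_i p_{z,i} - z_i p_{y,i} \right)$$

$$L_y = \sum_{i=1}^{n} (\mathbf{r}\times\mathbf{p})_y = \sum_{i=1}^{n} \left( z_i p_{x,i} - x_i p_{z,i} \right)$$

$$L_z = \sum_{i=1}^{n} (\mathbf{r}\times\mathbf{p})_z = \sum_{i=1}^{n} \left( x_i p_{y,i} - y_i p_{x,i} \right)$$

Evaluate the Poisson bracket

$$[L_x, L_y] = \sum_{i=1}^{n} \left[ \left( \frac{\partial L_x}{\partial x_i}\frac{\partial L_y}{\partial p_{x,i}} - \frac{\partial L_x}{\partial p_{x,i}}\frac{\partial L_y}{\partial x_i} \right) + \left( \frac{\partial L_x}{\partial y_i}\frac{\partial L_y}{\partial p_{y,i}} - \frac{\partial L_x}{\partial p_{y,i}}\frac{\partial L_y}{\partial y_i} \right) + \left( \frac{\partial L_x}{\partial z_i}\frac{\partial L_y}{\partial p_{z,i}} - \frac{\partial L_x}{\partial p_{z,i}}\frac{\partial L_y}{\partial z_i} \right) \right]$$

$$= \sum_{i=1}^{n} \left[ (0) + (0) + \left( x_i p_{y,i} - y_i p_{x,i} \right) \right] = L_z$$

Similarly, Poisson brackets for $L_x, L_y, L_z$ are

$$[L_x, L_y] = L_z \qquad [L_y, L_z] = L_x \qquad [L_z, L_x] = L_y$$

where x, y, and z are taken in a right-handed cyclic order. This usually is written in the form

$$[L_i, L_j] = \epsilon_{ijk} L_k$$

where the Levi-Civita density $\epsilon_{ijk}$ equals zero if two of the ijk indices are identical, otherwise it is +1 for a
cyclic permutation of i, j, k, and -1 for a non-cyclic permutation.

Note that since these Poisson brackets are nonzero, the components of the angular momentum $L_x, L_y, L_z$
do not commute and thus they cannot be measured precisely simultaneously. Thus we see that although $L^2$ and
$L_i$ are simultaneous constants of motion, where the subscript i can be either x, y, or z, only one component
$L_i$ can be measured simultaneously with $L^2$. This behavior is exhibited by rigid-body rotation where the body
precesses around one component of the total angular momentum, $L_z$, such that the total angular momentum,
$L^2$, plus the component along one axis, $L_z$, are constants of motion. Then $L_x^2 + L_y^2 = L^2 - L_z^2$ is constant
but not the individual $L_x$ or $L_y$.
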
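
The bracket relations among the Cartesian components can be confirmed numerically for a single particle. The following sketch is illustrative only: it uses a finite-difference estimate of equation 14.12, and the helper names and the phase-space point are assumptions introduced here.

```python
def gradient(f, z, h=1e-4):
    """Central-difference gradient of f at the phase-space point z."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def poisson_bracket(F, G, z, h=1e-4):
    """Estimate [F, G] at z = [q_1..q_n, p_1..p_n], following equation 14.12."""
    n = len(z) // 2
    dF, dG = gradient(F, z, h), gradient(G, z, h)
    return sum(dF[i] * dG[n + i] - dF[n + i] * dG[i] for i in range(n))


# State ordering for one particle: [x, y, z, p_x, p_y, p_z].
def Lx(s):
    x, y, zc, px, py, pz = s
    return y * pz - zc * py


def Ly(s):
    x, y, zc, px, py, pz = s
    return zc * px - x * pz


def Lz(s):
    x, y, zc, px, py, pz = s
    return x * py - y * px


def L2(s):
    """Square of the total angular momentum."""
    return Lx(s) ** 2 + Ly(s) ** 2 + Lz(s) ** 2


s0 = [0.3, -1.1, 0.8, 0.6, 0.2, -0.9]  # arbitrary phase-space point

# Cyclic relations [L_x, L_y] = L_z, [L_y, L_z] = L_x, [L_z, L_x] = L_y.
cyclic_residuals = [
    poisson_bracket(Lx, Ly, s0) - Lz(s0),
    poisson_bracket(Ly, Lz, s0) - Lx(s0),
    poisson_bracket(Lz, Lx, s0) - Ly(s0),
]

# L^2 has a vanishing bracket with each individual component.
total_with_component = [poisson_bracket(L2, comp, s0) for comp in (Lx, Ly, Lz)]

print(cyclic_residuals, total_with_component)
```

The cyclic residuals and the brackets of the squared total angular momentum with each component should all vanish to within round-off, consistent with the statement that the squared total and any one component have a vanishing mutual bracket while two different components do not.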
14.2.6  Hamilton’s  equations  of  motion

An  especially  important  application  of  Poisson  brackets  is  that  Hamilton’s  canonical  equations  of  motion 
can  be  expressed  directly  in  the  Poisson  bracket  form.  The  Poisson  bracket  representation  of  Hamiltonian 
mechanics  has  important  implications  to  quantum  mechanics  as  will  be  described  in  chapter  17. 

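In equation (14.45) assume that G is a fundamental coordinate, that is, $G = q_k$. Since $q_k$ is not explicitly
time dependent, then

$$\frac{dq_k}{dt} = \frac{\partial q_k}{\partial t} + [q_k,H] = \sum_i \left( \frac{\partial q_k}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial q_k}{\partial p_i}\frac{\partial H}{\partial q_i} \right) \tag{14.51}$$

$$= \sum_i \left( \delta_{ik}\frac{\partial H}{\partial p_i} - 0\cdot\frac{\partial H}{\partial q_i} \right) = \frac{\partial H}{\partial p_k} \tag{14.52}$$

That is

$$\dot{q}_k = [q_k,H] = \frac{\partial H}{\partial p_k} \tag{14.53}$$

Similarly consider the fundamental canonical momentum $G = p_k$. Since it is not explicitly time dependent,
then

$$\frac{dp_k}{dt} = \frac{\partial p_k}{\partial t} + [p_k,H] = \sum_i \left( \frac{\partial p_k}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial p_k}{\partial p_i}\frac{\partial H}{\partial q_i} \right) \tag{14.54}$$

$$= \sum_i \left( 0\cdot\frac{\partial H}{\partial p_i} - \delta_{ik}\frac{\partial H}{\partial q_i} \right) = -\frac{\partial H}{\partial q_k} \tag{14.55}$$

That is

$$\dot{p}_k = [p_k,H] = -\frac{\partial H}{\partial q_k} \tag{14.56}$$

Thus, it is seen that the Poisson bracket form of the equations of motion includes the Hamilton equations
of motion. That is,

$$\dot{q}_k = [q_k,H] = \frac{\partial H}{\partial p_k} \tag{14.57}$$

$$\dot{p}_k = [p_k,H] = -\frac{\partial H}{\partial q_k} \tag{14.58}$$

The above shows that the full structure of Hamilton’s equations of motion can be expressed directly in
terms of Poisson brackets.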
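
Equations 14.57 and 14.58 translate directly into a numerical scheme: the right-hand side of the equations of motion is simply the pair of brackets of q and p with H. The sketch below is one possible illustration, not the method of the text. It assumes a harmonic-oscillator Hamiltonian, finite-difference brackets, and a classical fourth-order Runge-Kutta step; all names and parameter values are hypothetical.

```python
import math


def gradient(f, z, h=1e-4):
    """Central-difference gradient of f at the phase-space point z."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def poisson_bracket(F, G, z, h=1e-4):
    """Estimate [F, G] at z = [q_1..q_n, p_1..p_n], following equation 14.12."""
    n = len(z) // 2
    dF, dG = gradient(F, z, h), gradient(G, z, h)
    return sum(dF[i] * dG[n + i] - dF[n + i] * dG[i] for i in range(n))


m, k = 1.0, 4.0  # illustrative mass and spring constant


def H(z):
    """Harmonic-oscillator Hamiltonian H = p^2/2m + k q^2/2."""
    q, p = z
    return p * p / (2.0 * m) + 0.5 * k * q * q


def flow(z):
    """Return [q_dot, p_dot] = [[q, H], [p, H]] (equations 14.57, 14.58)."""
    return [poisson_bracket(lambda s: s[0], H, z),
            poisson_bracket(lambda s: s[1], H, z)]


def rk4_step(z, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = flow(z)
    k2 = flow([z[i] + 0.5 * dt * k1[i] for i in range(2)])
    k3 = flow([z[i] + 0.5 * dt * k2[i] for i in range(2)])
    k4 = flow([z[i] + dt * k3[i] for i in range(2)])
    return [z[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
            for i in range(2)]


z = [1.0, 0.0]            # start at q = 1, p = 0
energy_start = H(z)
dt, steps = 0.01, 500
for _ in range(steps):
    z = rk4_step(z, dt)

omega = math.sqrt(k / m)
q_exact = math.cos(omega * dt * steps)   # exact solution for these initial values
energy_drift = H(z) - energy_start

print(z[0], q_exact, energy_drift)
```

The integrated coordinate should track the exact cosine solution closely and the energy should remain essentially constant. Replacing `H` with another Hamiltonian requires no other change, which is the practical appeal of the bracket form.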

The elegant formulation of Poisson brackets has the same form in all canonical coordinates as the
Hamiltonian formulation. However, the normal Hamilton canonical equations in classical mechanics assume
implicitly that one can specify the exact position and momentum of a particle simultaneously at any point in time,
which is applicable only to classical mechanics variables that are continuous functions of the coordinates,
and not to quantized systems. The important feature of the Poisson Bracket representation of Hamilton’s
equations is that it generalizes Hamilton’s equations into a form (14.57, 14.58) where the Poisson bracket is
equally consistent with both classical and quantum mechanics in that it allows for non-commuting canonical
variables and Heisenberg’s Uncertainty Principle. Thus the generalization of Hamilton’s equations, via use
of the Poisson brackets, provides one of the most powerful analytic tools applicable to both classical and
quantal dynamics. It played a pivotal role in derivation of quantum theory as described in chapter 17.

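14.3 Example: Lorentz force in electromagnetism

Consider a charge q, and mass m, in a constant electromagnetic field with scalar potential $\Phi$ and vector
potential $\mathbf{A}$. Chapter 6.11 showed that the Lagrangian can be written as

$$L = \frac{1}{2}m\dot{\mathbf{x}}\cdot\dot{\mathbf{x}} - q\left( \Phi - \mathbf{A}\cdot\dot{\mathbf{x}} \right)$$

The generalized momentum then is given by

$$\mathbf{p} = \frac{\partial L}{\partial \dot{\mathbf{x}}} = m\dot{\mathbf{x}} + q\mathbf{A}$$

Thus the Hamiltonian can be written as

$$H = (\mathbf{p}\cdot\dot{\mathbf{x}}) - L = \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} + q\Phi$$

The Hamilton equations of motion give

$$\dot{\mathbf{x}} = [\mathbf{x},H] = \frac{(\mathbf{p} - q\mathbf{A})}{m}$$

and

$$\dot{\mathbf{p}} = [\mathbf{p},H] = -q\nabla\Phi + \frac{q}{m}\left\{ (\mathbf{p} - q\mathbf{A})\times(\nabla\times\mathbf{A}) + \left( (\mathbf{p} - q\mathbf{A})\cdot\nabla \right)\mathbf{A} \right\}$$

Define the magnetic field to be

$$\mathbf{B} \equiv \nabla\times\mathbf{A}$$

and the electric field to be

$$\mathbf{E} \equiv -\nabla\Phi - \frac{\partial \mathbf{A}}{\partial t}$$

The force on the charge is the rate of change of the mechanical momentum $m\dot{\mathbf{x}} = \mathbf{p} - q\mathbf{A}$. Using
$\frac{d\mathbf{A}}{dt} = \frac{\partial \mathbf{A}}{\partial t} + (\dot{\mathbf{x}}\cdot\nabla)\mathbf{A}$, the Lorentz force can be written as

$$\mathbf{F} = m\ddot{\mathbf{x}} = \dot{\mathbf{p}} - q\frac{d\mathbf{A}}{dt} = q\left( \mathbf{E} + \dot{\mathbf{x}}\times\mathbf{B} \right)$$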
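
The Lorentz force can be recovered numerically from the Hamiltonian by applying the bracket twice: the velocity is [x, H], and because the velocity has no explicit time dependence the acceleration is the bracket of the velocity with H. The sketch below is an illustration under assumptions made here, namely uniform static fields represented by the potentials A = (B x r)/2 and Phi = -E . r, finite-difference brackets, and arbitrary parameter values.

```python
def gradient(f, z, h=1e-4):
    """Central-difference gradient of f at the phase-space point z."""
    grad = []
    for i in range(len(z)):
        up, down = list(z), list(z)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def poisson_bracket(F, G, z, h=1e-4):
    """Estimate [F, G] at z = [q_1..q_n, p_1..p_n], following equation 14.12."""
    n = len(z) // 2
    dF, dG = gradient(F, z, h), gradient(G, z, h)
    return sum(dF[i] * dG[n + i] - dF[n + i] * dG[i] for i in range(n))


def cross(a, b):
    """Vector product of two 3-component lists."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


m, charge = 1.5, 0.8              # illustrative mass and charge
E_field = [0.2, -0.1, 0.4]        # assumed uniform, static electric field
B_field = [0.3, 0.5, -0.7]        # assumed uniform, static magnetic field


def vector_potential(r):
    """A = (B x r)/2, whose curl is the uniform field B."""
    return [0.5 * c for c in cross(B_field, r)]


def scalar_potential(r):
    """Phi = -E . r, whose negative gradient is the uniform field E."""
    return -sum(E_field[i] * r[i] for i in range(3))


def H(z):
    """H = (p - qA)^2 / 2m + q Phi, with z = [x, y, z, p_x, p_y, p_z]."""
    r, p = z[:3], z[3:]
    A = vector_potential(r)
    kinetic = sum((p[i] - charge * A[i]) ** 2 for i in range(3)) / (2.0 * m)
    return kinetic + charge * scalar_potential(r)


def velocity(i):
    """Velocity component x_i_dot = [x_i, H] as a function of z."""
    return lambda z: poisson_bracket(lambda s: s[i], H, z)


z0 = [0.4, -0.2, 0.9, 1.1, 0.3, -0.6]  # arbitrary phase-space point

v = [velocity(i)(z0) for i in range(3)]
# Acceleration: x_i_ddot = [x_i_dot, H], a nested bracket.
accel = [poisson_bracket(velocity(i), H, z0) for i in range(3)]

v_cross_B = cross(v, B_field)
lorentz = [charge * (E_field[i] + v_cross_B[i]) for i in range(3)]
residual = [m * accel[i] - lorentz[i] for i in range(3)]

print(residual)
```

The residual compares the mass times the acceleration with q(E + v x B) and should vanish to within the finite-difference error. Note that the check is made on the mechanical momentum rather than on the canonical momentum p, which also contains the field contribution qA.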


14.4 Example: Wave motion

Assume that one is dealing with traveling waves of the form $\Psi = A e^{i\left( \frac{x p_x}{\hbar} - \omega t \right)}$ for a one-dimensional
conservative system of many identical coupled linear oscillators. Then evaluating the following Poisson
brackets gives

$$[p_x, H] = 0 \qquad [x, H] = 0 \qquad [\omega, H] = 0 \qquad [t, H] = 0$$

Thus $p_x$, $x$, $\omega$, and $t$ are constants of motion. However,

$$[p_x, x] \neq 0 \qquad [\omega, t] \neq 0$$

Thus one cannot simultaneously measure the conjugate variables $(p_x, x)$ or $(\omega, t)$. This is the Uncertainty
Principle manifest by all forms of wave motion in classical and quantal mechanics as discussed in chapter
3.11.3.

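14.5 Example: Two-dimensional, anisotropic, linear oscillator

Consider a mass m bound by an anisotropic, two-dimensional, linear oscillator potential. As discussed
in chapter 9, the motion can be described as lying entirely in the x-y plane that is perpendicular to the
angular momentum J. It is interesting to derive the equations of motion for this system using the Poisson
bracket representation of Hamiltonian mechanics.

The kinetic energy is given by

$$T(\dot{x},\dot{y}) = \frac{1}{2}m\left( \dot{x}^2 + \dot{y}^2 \right)$$

The linear binding is reproduced assuming a quadratic scalar potential energy of the form

$$U(x,y) = \frac{1}{2}k\left( x^2 + y^2 \right) + \eta xy$$

where $\eta$ is the anharmonic strength that couples the modes of the isotropic linear oscillator.

a) NORMAL MODES: As discussed in chapter 12, a transformation to the normal modes of the system
is given by using variables $(\alpha, \beta)$ where $\alpha \equiv \frac{1}{\sqrt{2}}(x + y)$ and $\beta \equiv \frac{1}{\sqrt{2}}(x - y)$, that is

$$x = \frac{1}{\sqrt{2}}(\alpha + \beta) \qquad y = \frac{1}{\sqrt{2}}(\alpha - \beta)$$

Expressing the kinetic and potential energies in terms of the new coordinates gives

$$T = \frac{1}{4}m\left[ (\dot{\alpha} + \dot{\beta})^2 + (\dot{\alpha} - \dot{\beta})^2 \right] = \frac{1}{2}m\left( \dot{\alpha}^2 + \dot{\beta}^2 \right)$$

$$U = \frac{1}{4}k\left[ (\alpha + \beta)^2 + (\alpha - \beta)^2 \right] + \frac{1}{2}\eta\left( \alpha^2 - \beta^2 \right) = \frac{1}{2}(k + \eta)\alpha^2 + \frac{1}{2}(k - \eta)\beta^2$$

Note that the coordinate transformation makes the Lagrangian separable, that is

$$L = \frac{1}{2}m\left( \dot{\alpha}^2 + \dot{\beta}^2 \right) - \frac{1}{2}(k + \eta)\alpha^2 - \frac{1}{2}(k - \eta)\beta^2 = L_\alpha + L_\beta$$

where

$$L_\alpha = \frac{1}{2}m\dot{\alpha}^2 - \frac{1}{2}(k + \eta)\alpha^2 \qquad L_\beta = \frac{1}{2}m\dot{\beta}^2 - \frac{1}{2}(k - \eta)\beta^2$$

This shows that the transformation has separated the system into two normal modes that are harmonic
oscillators with angular frequencies

$$\omega_1 = \sqrt{\frac{k + \eta}{m}} \qquad \omega_2 = \sqrt{\frac{k - \eta}{m}}$$

Note that the non-isotropic harmonic oscillator reduces to the isotropic linear oscillator when $\eta = 0$.

b) HAMILTONIAN: The canonical momenta are given by

$$p_\alpha = \frac{\partial L}{\partial \dot{\alpha}} = m\dot{\alpha} \qquad p_\beta = \frac{\partial L}{\partial \dot{\beta}} = m\dot{\beta}$$

The definition of the Hamiltonian gives

$$H = p_\alpha\dot{\alpha} + p_\beta\dot{\beta} - L = \frac{1}{2m}\left( p_\alpha^2 + p_\beta^2 \right) + \frac{1}{2}(k + \eta)\alpha^2 + \frac{1}{2}(k - \eta)\beta^2$$

Note that this can be separated as

$$H = H_\alpha + H_\beta$$

where

$$H_\alpha = \frac{p_\alpha^2}{2m} + \frac{1}{2}(k + \eta)\alpha^2 \qquad H_\beta = \frac{p_\beta^2}{2m} + \frac{1}{2}(k - \eta)\beta^2$$
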
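Using the Poisson Bracket expression for the time dependence, equation 14.45, and using the fact that
the Hamiltonian is not explicitly time dependent, that is, $\frac{\partial H_\alpha}{\partial t} = 0$, gives

$$\frac{dH_\alpha}{dt} = \frac{\partial H_\alpha}{\partial t} + [H_\alpha, H] = 0 + [H_\alpha, H_\alpha + H_\beta] = [H_\alpha, H_\beta]$$

$$= \frac{\partial H_\alpha}{\partial \alpha}\frac{\partial H_\beta}{\partial p_\alpha} + \frac{\partial H_\alpha}{\partial \beta}\frac{\partial H_\beta}{\partial p_\beta} - \frac{\partial H_\alpha}{\partial p_\alpha}\frac{\partial H_\beta}{\partial \alpha} - \frac{\partial H_\alpha}{\partial p_\beta}\frac{\partial H_\beta}{\partial \beta} = 0$$

Each product vanishes because $H_\alpha$ depends only on $(\alpha, p_\alpha)$ while $H_\beta$ depends only on $(\beta, p_\beta)$.
Similarly $\frac{dH_\beta}{dt} = 0$. This implies that the Hamiltonians for both normal modes, $H_\alpha$ and $H_\beta$, are
time-independent constants of motion which are equal to the total energy for each mode.

c) ANGULAR MOMENTUM: The angular momentum for motion in the $\alpha\beta$ plane is perpendicular to
the $\alpha\beta$ plane with a magnitude of

$$J = \alpha p_\beta - \beta p_\alpha$$

The time dependence of the angular momentum is given by

$$\frac{dJ}{dt} = \frac{\partial J}{\partial t} + \left( \frac{\partial J}{\partial \alpha}\frac{\partial H}{\partial p_\alpha} - \frac{\partial J}{\partial p_\alpha}\frac{\partial H}{\partial \alpha} + \frac{\partial J}{\partial \beta}\frac{\partial H}{\partial p_\beta} - \frac{\partial J}{\partial p_\beta}\frac{\partial H}{\partial \beta} \right)$$

$$= \frac{p_\beta p_\alpha}{m} + k\beta\alpha + \eta\beta\alpha - \frac{p_\alpha p_\beta}{m} - k\alpha\beta + \eta\beta\alpha = 2\eta\beta\alpha$$

Note that if $\eta = 0$, then the two eigenfrequencies are degenerate, $\omega_1 = \omega_2$, that is, the system reduces to
the isotropic harmonic oscillator in the $\alpha\beta$ plane that was discussed in chapter 9.9. In addition, $\frac{dJ}{dt} = 0$ for
$\eta = 0$, that is, the angular momentum J in the $\alpha\beta$ plane is a constant of motion when $\eta = 0$.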
d)  SYMMETRY  TENSOR:  The  symmetry  tensor  was  defined  in  chapter  9.9.3  to  be 


AC 


PiPj 

2m 


\kxiXj 

2 


where  i and  j can  correspond  to  either  a or  p.  The  symmetry  tensor  defines  the  orientation  of  the  major 
axis  of  the  elliptical  orbit  for  the  two-dimensional,  isotropic,  linear  oscillator  as  described  in  chapter  9.9.3. 

The  isotropic  oscillator  has  been  shown  to  have  two  normal  modes  that  are  degenerate,  therefore  a and 
P are  equally  good  normal  modes.  The  Hamiltonian  showed  that,  for  77  = 0,  the  Hamiltonian  gives  the  total 
energy  is  conserved,  as  well  as  the  energies  for  each  of  the  two  normal  modes  which  are. 


Ea  = ^ + \ka2 
2m  2 


E =P± 

13  2m 


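Consider the matrix element

$$A'_{ij} = \frac{p_i p_j}{2m} + \frac{1}{2}k x_i x_j$$

where $i, j$ each can represent $\alpha$ or $\beta$. Then for each matrix element

$$\frac{dA'_{ij}}{dt} = \frac{\partial A'_{ij}}{\partial t} + [A'_{ij},H] = 0 + \left(\frac{\partial A'_{ij}}{\partial \alpha}\frac{\partial H}{\partial p_\alpha} - \frac{\partial A'_{ij}}{\partial p_\alpha}\frac{\partial H}{\partial \alpha} + \frac{\partial A'_{ij}}{\partial \beta}\frac{\partial H}{\partial p_\beta} - \frac{\partial A'_{ij}}{\partial p_\beta}\frac{\partial H}{\partial \beta}\right) = 0$$

That is, each matrix element $A'_{ij}$ commutes with the Hamiltonian

$$[A'_{ij},H] = 0$$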

Thus the Poisson bracket representation of Hamiltonian mechanics has been used to prove that the symmetry tensor $A'_{ij} = \frac{p_i p_j}{2m} + \frac{1}{2}k x_i x_j$ is a constant of motion for the isotropic harmonic oscillator. That is, all the elements $A'_{\alpha\alpha}$, $A'_{\beta\beta}$, and $A'_{\alpha\beta}$ of the symmetry tensor $\mathbf{A}'$ commute with the Hamiltonian.

Note that the three constants of motion, $L$, $\mathbf{A}'$ and $H$ for the isotropic, two-dimensional, linear oscillator form a closed algebra under the Poisson bracket formalism.
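The vanishing of $[A'_{ij},H]$ can be confirmed element by element. The sketch below assumes the isotropic ($\eta = 0$) Hamiltonian $H = \frac{p_\alpha^2 + p_\beta^2}{2m} + \frac{1}{2}k(\alpha^2 + \beta^2)$; the function names are illustrative and not part of the text.

```python
import sympy as sp

a, b, pa, pb = sp.symbols("alpha beta p_alpha p_beta", real=True)
m, k = sp.symbols("m k", positive=True)

x = (a, b)      # coordinates x_i
p = (pa, pb)    # conjugate momenta p_i

def poisson_bracket(F, G):
    """[F, G] = sum_i (dF/dx_i dG/dp_i - dF/dp_i dG/dx_i)."""
    return sum(sp.diff(F, xi) * sp.diff(G, pi) - sp.diff(F, pi) * sp.diff(G, xi)
               for xi, pi in zip(x, p))

# Assumed isotropic two-dimensional oscillator Hamiltonian (eta = 0).
H = (pa**2 + pb**2) / (2 * m) + k * (a**2 + b**2) / 2

def symmetry_tensor(i, j):
    """A'_ij = p_i p_j / 2m + k x_i x_j / 2."""
    return p[i] * p[j] / (2 * m) + k * x[i] * x[j] / 2

# Since A'_ij has no explicit time dependence, dA'_ij/dt = [A'_ij, H].
brackets = {(i, j): sp.simplify(poisson_bracket(symmetry_tensor(i, j), H))
            for i in range(2) for j in range(2)}
trace_minus_H = sp.simplify(symmetry_tensor(0, 0) + symmetry_tensor(1, 1) - H)
print(brackets, trace_minus_H)
```

The diagonal elements are the two normal-mode energies, so the trace of the tensor reproduces the total Hamiltonian.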


14.6  Example:  The  eccentricity  vector 

Chapter 9.8.4 showed that Hamilton's eccentricity vector for the inverse-square-law attractive force,

$$\mathbf{A} = (\mathbf{p} \times \mathbf{L}) + (\mu k \hat{\mathbf{r}})$$

is a constant of motion that specifies the major axis of the elliptical orbit. The eccentricity vector for the inverse-square-law force can be investigated using Poisson brackets as was done for the symmetry tensor above. It can be shown that

$$[L_i, A_j] = \epsilon_{ijk} A_k$$

$$[A_i, A_j] = -2\mu\left(\frac{p^2}{2\mu} + \frac{k}{r}\right)\epsilon_{ijk} L_k \tag{a}$$

Note that the bracket on the right-hand side of equation (a) equals the Hamiltonian $H$ for the inverse-square-law attractive force, and thus the Poisson bracket equals

$$[A_i, A_j] = -2\mu\left(\frac{p^2}{2\mu} + \frac{k}{r}\right)\epsilon_{ijk} L_k = -2\mu H \epsilon_{ijk} L_k$$

For the Hamiltonian $H$ it can be shown that the Poisson bracket

$$[H, \mathbf{A}] = 0$$

That is, the eccentricity vector commutes with the Hamiltonian and thus it is a constant of motion. Previously this result was obtained directly using the equations of motion as given in equation 9.87. Note that the three constants of motion, $\mathbf{L}$, $\mathbf{A}$ and $H$ form a closed algebra under the Poisson bracket formalism, similar to the triad of constants of motion, $L$, $\mathbf{A}'$ and $H$, that occur for the two-dimensional, isotropic, linear oscillator described above. Examples 14.5 and 14.6 illustrate that the Poisson bracket representation of Hamiltonian mechanics is a powerful probe of the underlying physics, as well as confirming the results obtained directly from the equations of motion as described in chapters 9.8.4 and 9.9.3.
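The bracket relations quoted in Example 14.6 can be spot-checked with a computer algebra system. The sketch below assumes the Hamiltonian $H = \frac{p^2}{2\mu} + \frac{k}{r}$ and the eccentricity vector $\mathbf{A} = \mathbf{p}\times\mathbf{L} + \mu k \hat{\mathbf{r}}$, and evaluates each residual at one arbitrarily chosen phase-space point (a hypothetical test point with $r = 3$) instead of attempting a full symbolic simplification.

```python
import sympy as sp

x, y, z, px, py, pz = sp.symbols("x y z p_x p_y p_z", real=True)
mu = sp.Symbol("mu", positive=True)
k = sp.Symbol("k", real=True)

coords, momenta = (x, y, z), (px, py, pz)

def poisson_bracket(F, G):
    """[F, G] = sum_i (dF/dq_i dG/dp_i - dF/dp_i dG/dq_i)."""
    return sum(sp.diff(F, q) * sp.diff(G, p) - sp.diff(F, p) * sp.diff(G, q)
               for q, p in zip(coords, momenta))

r_vec, p_vec = sp.Matrix(coords), sp.Matrix(momenta)
r = sp.sqrt(r_vec.dot(r_vec))
L = r_vec.cross(p_vec)                          # angular momentum L = r x p
A = p_vec.cross(L) + r_vec * (mu * k / r)       # eccentricity vector
H = p_vec.dot(p_vec) / (2 * mu) + k / r         # assumed Hamiltonian

# Each entry should vanish if the quoted relations hold.
residuals = {
    "[H, A_x]": poisson_bracket(H, A[0]),
    "[H, A_y]": poisson_bracket(H, A[1]),
    "[H, A_z]": poisson_bracket(H, A[2]),
    "[L_x, A_y] - A_z": poisson_bracket(L[0], A[1]) - A[2],
    "[A_x, A_y] + 2 mu H L_z": poisson_bracket(A[0], A[1]) + 2 * mu * H * L[2],
}

# Arbitrary test point (illustrative values only), chosen so that r = 3.
point = {x: 1, y: 2, z: 2,
         px: sp.Rational(1, 3), py: sp.Rational(-1, 2), pz: sp.Rational(2, 5),
         mu: 2, k: -3}
values = {name: float(expr.subs(point)) for name, expr in residuals.items()}
print(values)
```

A single-point evaluation is not a proof, but it is a quick guard against sign or factor errors in the bracket relations.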


14.2.7  Liouville’s  Theorem 

Liouville's theorem illustrates the application of Poisson brackets to Hamiltonian phase space, which has important implications for statistical physics. The trajectory of a single particle in phase space is completely determined by the equations of motion if the initial conditions are known. However, many-body systems have so many degrees of freedom that it becomes impractical to solve all the equations of motion of the many bodies. An example is a statistical ensemble in a gas, a plasma, or a beam of particles. Usually it is not possible to specify the exact point in phase space for such complicated systems; however, it is possible to define an ensemble of points in phase space that encompasses all possible trajectories for the complicated system. That is, the statistical distribution of particles in phase space can be specified.

Consider a density $\rho$ of representative points in $(\mathbf{q}, \mathbf{p})$ phase space. The number $N$ of systems in the volume element $dv$ is

$$N = \rho\, dv \tag{14.59}$$

where it is assumed that the infinitesimal volume element $dv = dq_1 dq_2 \ldots dq_s\, dp_1 dp_2 \ldots dp_s$ contains many possible systems so that $\rho$ can be considered a continuous distribution. For the conjugate variables $(q_i, p_i)$ shown in figure 14.1, the number of representative points moving across the left-hand edge into the area per unit time is

$$\rho \dot{q}_i\, dp_i \tag{14.60}$$

Figure 14.1: Infinitesimal element of area in phase space
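The number of representative points flowing out of the area along the right-hand edge is

$$\left[\rho \dot{q}_i + \frac{\partial}{\partial q_i}\left(\rho \dot{q}_i\right) dq_i\right] dp_i \tag{14.61}$$

Hence the net increase in $\rho$ in the infinitesimal rectangular element $dq_i dp_i$ due to flow in the horizontal direction is

$$-\frac{\partial}{\partial q_i}\left(\rho \dot{q}_i\right) dq_i dp_i \tag{14.62}$$

Similarly, the net gain due to flow in the vertical direction is

$$-\frac{\partial}{\partial p_i}\left(\rho \dot{p}_i\right) dp_i dq_i \tag{14.63}$$

Thus the total increase in the element $dq_i dp_i$ per unit time is

$$-\left[\frac{\partial}{\partial q_i}\left(\rho \dot{q}_i\right) + \frac{\partial}{\partial p_i}\left(\rho \dot{p}_i\right)\right] dp_i dq_i \tag{14.64}$$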

Assume that the total number of points is conserved. Then the net gain (14.64) must equal the rate of change of the number of points inside the infinitesimal element $dq_i dp_i$, that is

$$\left(\frac{\partial \rho}{\partial t}\right) dq_i dp_i \tag{14.65}$$

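Thus summing over all possible values of $i$ gives

$$\frac{\partial \rho}{\partial t} + \sum_i \left[\frac{\partial}{\partial q_i}\left(\rho \dot{q}_i\right) + \frac{\partial}{\partial p_i}\left(\rho \dot{p}_i\right)\right] = 0 \tag{14.66}$$

or

$$\frac{\partial \rho}{\partial t} + \sum_i \left[\frac{\partial \rho}{\partial q_i}\dot{q}_i + \frac{\partial \rho}{\partial p_i}\dot{p}_i\right] + \rho \sum_i \left[\frac{\partial \dot{q}_i}{\partial q_i} + \frac{\partial \dot{p}_i}{\partial p_i}\right] = 0 \tag{14.67}$$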


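Inserting Hamilton's canonical equations into both brackets and differentiating the last bracket results in

$$\frac{\partial \rho}{\partial t} + \sum_i \left[\frac{\partial H}{\partial p_i}\frac{\partial \rho}{\partial q_i} - \frac{\partial H}{\partial q_i}\frac{\partial \rho}{\partial p_i}\right] + \rho \sum_i \left[\frac{\partial^2 H}{\partial p_i \partial q_i} - \frac{\partial^2 H}{\partial p_i \partial q_i}\right] = 0 \tag{14.68}$$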


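The two terms in the last bracket cancel and thus

$$\frac{\partial \rho}{\partial t} + \sum_i \left[\frac{\partial H}{\partial p_i}\frac{\partial \rho}{\partial q_i} - \frac{\partial H}{\partial q_i}\frac{\partial \rho}{\partial p_i}\right] = \frac{\partial \rho}{\partial t} + [\rho, H] = 0 \tag{14.69}$$

However, this just equals $\frac{d\rho}{dt}$, therefore

$$\frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + [\rho, H] = 0 \tag{14.70}$$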

This  is  called  Liouville’s  theorem  which  states  that  the  rate  of  change  of  density  of  representative 
points  vanishes,  that  is,  the  density  of  points  is  a constant  in  the  Hamiltonian  phase  space  along  a specific 
trajectory.  Liouville’s  theorem  means  that  the  system  acts  like  an  incompressible  fluid  that  moves  such  as  to 
occupy  an  equal  volume  in  phase  space  at  every  instant,  even  though  the  shape  of  the  phase-space  volume 
may  change,  that  is,  the  phase-space  density  of  the  fluid  remains  constant.  Equation  (14.70)  is  another 
illustration  of  the  basic  Poisson  bracket  relation  (14.45)  and  the  usefulness  of  Poisson  brackets  in  physics. 
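The incompressibility of the phase-space flow can be illustrated for a system whose motion is known in closed form. The following sketch assumes a one-dimensional harmonic oscillator, $H = \frac{p^2}{2m} + \frac{1}{2}m\omega^2 q^2$, chosen only as a convenient example; it checks that the map from the initial point $(q_0, p_0)$ to the point $(q(t), p(t))$ has unit Jacobian determinant, so that an area element $dq\,dp$ is preserved at all times.

```python
import sympy as sp

q0, p0 = sp.symbols("q0 p0", real=True)
m, omega, t = sp.symbols("m omega t", positive=True)

# Exact flow of the assumed oscillator Hamiltonian from (q0, p0) at time 0.
q_t = q0 * sp.cos(omega * t) + p0 / (m * omega) * sp.sin(omega * t)
p_t = p0 * sp.cos(omega * t) - m * omega * q0 * sp.sin(omega * t)

# Confirm that the flow obeys Hamilton's equations for this Hamiltonian.
eq_q = sp.simplify(sp.diff(q_t, t) - p_t / m)                 # dq/dt - dH/dp
eq_p = sp.simplify(sp.diff(p_t, t) + m * omega**2 * q_t)      # dp/dt + dH/dq

# Jacobian of the map (q0, p0) -> (q_t, p_t); its determinant is the factor
# by which an infinitesimal phase-space area is scaled by the flow.
jacobian = sp.Matrix([q_t, p_t]).jacobian(sp.Matrix([q0, p0]))
area_scale = sp.simplify(jacobian.det())
print(area_scale)
```

The shape of a region in phase space is sheared and rotated by this flow, but the determinant shows that its area is unchanged.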

Liouville's theorem is crucially important to the statistical mechanics of ensembles, where exact knowledge of the system is unavailable and only statistical averages are known. An example is the focussing of beams of charged particles by beam-handling systems. At a focus of the beam, the transverse width in $x$ is minimized while the width in $p_x$ is largest, since the beam is converging to the focus, whereas a parallel beam has maximum width in $x$ and minimum spread in $p_x$. However, the product of the widths in $x$ and $p_x$ remains constant throughout the focussing system. For a two-dimensional beam, this applies equally to the $y$ and $p_y$ coordinates, etc. The final beam quality for any beam transport system is ultimately limited by the emittance of the source of the beam, that is, the initial area of the phase-space distribution. Note that Liouville's theorem only applies to Hamiltonian $(q_i, p_i)$ phase space, not to $(q_i, \dot{q}_i)$ Lagrangian state space. As a consequence, Hamiltonian dynamics, rather than Lagrangian dynamics, is used to discuss ensembles in statistical physics.

Note that Liouville's theorem is applicable only for conservative systems, that is, where Hamilton's equations of motion apply. For dissipative systems the phase-space volume shrinks with time rather than being a constant of the motion.

14.3  Canonical transformations in Hamiltonian mechanics


Hamiltonian mechanics is an especially elegant and powerful way to derive the equations of motion for complicated systems. Unfortunately, integrating the equations of motion to derive a solution can be a challenge. Hamilton recognized this difficulty, so he proposed using generating functions to make canonical transformations that transform the equations into a known soluble form. Jacobi, a contemporary mathematician, recognized the importance of Hamilton's pioneering developments in Hamiltonian mechanics, and therefore he developed a sophisticated mathematical framework for exploiting the generating function formalism in order to make the canonical transformations required to solve Hamilton's equations of motion.

In the Lagrange formulation, transforming coordinates $(q_i, \dot{q}_i)$ to cyclic generalized coordinates $(Q_i, \dot{Q}_i)$ simplifies finding the Euler-Lagrange equations of motion. For the Hamiltonian formulation, the concept of coordinate transformations is extended to include simultaneous canonical transformation of both the spatial coordinates $q_i$ and the conjugate momenta $p_i$ from $(q_i, p_i)$ to $(Q_i, P_i)$, where both of the canonical variables are treated equally in the transformation. Compared to Lagrangian mechanics, Hamiltonian mechanics has twice as many variables, which is an asset rather than a liability, since it widens the realm of possible canonical transformations.

Hamiltonian mechanics has the advantage that generating functions can be exploited to make canonical transformations to find solutions, which avoids having to use direct integration. Canonical transformations are the foundation of Hamiltonian mechanics; they underlie Hamilton-Jacobi theory and action-angle variable theory, both of which are powerful means for exploiting Hamiltonian mechanics to solve problems in physics and engineering. The concept underlying canonical transformations is that, if the equations of motion are simplified by using a new set of generalized variables $(\mathbf{Q}, \mathbf{P})$, compared to using the original set of variables $(\mathbf{q}, \mathbf{p})$, then an advantage has been gained. The solution, expressed in terms of the generalized variables $(\mathbf{Q}, \mathbf{P})$, can be transformed back to express the solution in terms of the original coordinates, $(\mathbf{q}, \mathbf{p})$.

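Only a specialized subset of transformations will be considered, namely canonical transformations that preserve the canonical form of Hamilton's equations of motion. That is, given that the original set of variables $(q_i, p_i)$ satisfy Hamilton's equations

$$\dot{\mathbf{q}} = \frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial \mathbf{p}} \qquad -\dot{\mathbf{p}} = \frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial \mathbf{q}} \tag{14.71}$$

for some Hamiltonian $H(\mathbf{q},\mathbf{p},t)$, then the transformation to coordinates $Q_i(q_k, p_k, t)$, $P_i(q_k, p_k, t)$ is canonical if, and only if, there exists a function $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ such that the $\mathbf{P}$ and $\mathbf{Q}$ are still governed by Hamilton's equations. That is,

$$\dot{\mathbf{Q}} = \frac{\partial \mathcal{H}(\mathbf{Q},\mathbf{P},t)}{\partial \mathbf{P}} \qquad -\dot{\mathbf{P}} = \frac{\partial \mathcal{H}(\mathbf{Q},\mathbf{P},t)}{\partial \mathbf{Q}} \tag{14.72}$$

where $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ plays the role of the Hamiltonian for the new variables. Note that $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ may be very different from the old Hamiltonian $H(\mathbf{q},\mathbf{p},t)$. The invariance of the Poisson bracket to canonical transformations, chapter 14.2.3, provides a powerful test that the transformation is canonical.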

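Hamilton's Principle of least action, discussed in chapter 13, states that

$$\delta S = \delta \int_{t_1}^{t_2} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt = \delta \int_{t_1}^{t_2} \left[\mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q},\mathbf{p},t)\right] dt = 0 \tag{14.73}$$

Similarly, applying Hamilton's Principle of least action to the new Lagrangian $\mathcal{L}(\mathbf{Q},\dot{\mathbf{Q}},t)$ gives

$$\delta S = \delta \int_{t_1}^{t_2} \mathcal{L}(\mathbf{Q},\dot{\mathbf{Q}},t)\,dt = \delta \int_{t_1}^{t_2} \left[\mathbf{P}\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t)\right] dt = 0 \tag{14.74}$$

The discussion of gauge-invariant Lagrangians, chapter 13.4, showed that $L$ and $\mathcal{L}$ can be related by the total time derivative of a generating function $F$ where

$$\frac{dF}{dt} = L - \mathcal{L} \tag{14.75}$$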


The generating function $F$ can be any well-behaved function, with continuous second derivatives, of both the old and new canonical variables $\mathbf{p}, \mathbf{q}, \mathbf{P}, \mathbf{Q}$ and $t$. Thus the integrands of (14.73) and (14.74) are related by

$$\mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q},\mathbf{p},t) = \lambda\left[\mathbf{P}\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t)\right] + \frac{dF}{dt} \tag{14.76}$$

where $\lambda$ is a possible scale transformation. A scale transformation, such as changing units, is trivial, and will be assumed to be absorbed into the coordinates, making $\lambda = 1$. A transformation with $\lambda \neq 1$ is called an extended canonical transformation.

14.3.1  Generating  functions 

The generating function $F$ has to be chosen such that the transformation from the initial variables $(\mathbf{q}, \mathbf{p})$ to the final variables $(\mathbf{Q}, \mathbf{P})$ is a canonical transformation. The chosen generating function contributes to (14.76) only if it is a function of the old plus new variables. The four possible types of generating functions of the first kind are $F_1(\mathbf{q},\mathbf{Q},t)$, $F_2(\mathbf{q},\mathbf{P},t)$, $F_3(\mathbf{p},\mathbf{Q},t)$, and $F_4(\mathbf{p},\mathbf{P},t)$. These four generating functions, which lead to relatively simple canonical transformations, are shown below.


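Type 1: $F = F_1(\mathbf{q},\mathbf{Q},t)$ :

The total time derivative of the generating function $F = F_1(\mathbf{q},\mathbf{Q},t)$ is given by

$$\frac{dF(\mathbf{q},\mathbf{Q},t)}{dt} = \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial \mathbf{q}}\cdot\dot{\mathbf{q}} + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial \mathbf{Q}}\cdot\dot{\mathbf{Q}} + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial t} \tag{14.77}$$

Insert equation (14.77) into equation (14.76), and assume that the trivial scale factor $\lambda = 1$, then

$$\left[\mathbf{p} - \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial \mathbf{q}}\right]\cdot\dot{\mathbf{q}} - H(\mathbf{q},\mathbf{p},t) = \left[\mathbf{P} + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial \mathbf{Q}}\right]\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial t}$$

Assume that the generating function $F_1$ determines the canonical variables $\mathbf{p}$ and $\mathbf{P}$ to be

$$\mathbf{p} = \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial \mathbf{q}} \qquad \mathbf{P} = -\frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial \mathbf{Q}} \tag{14.78}$$

then the terms in each square bracket cancel, leading to the required canonical transformation

$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial t} \tag{14.79}$$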


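Type 2: $F = F_2(\mathbf{q},\mathbf{P},t) - \mathbf{Q}\cdot\mathbf{P}$ :

The total time derivative of the generating function $F = F_2(\mathbf{q},\mathbf{P},t) - \mathbf{Q}\cdot\mathbf{P}$ is given by

$$\frac{dF}{dt} = \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial \mathbf{q}}\cdot\dot{\mathbf{q}} + \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial \mathbf{P}}\cdot\dot{\mathbf{P}} - \mathbf{P}\cdot\dot{\mathbf{Q}} - \dot{\mathbf{P}}\cdot\mathbf{Q} + \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial t} \tag{14.80}$$

Insert this into equation (14.76), and assume that the trivial scale factor $\lambda = 1$, then

$$\left[\mathbf{p} - \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial \mathbf{q}}\right]\cdot\dot{\mathbf{q}} - H(\mathbf{q},\mathbf{p},t) = \mathbf{P}\cdot\dot{\mathbf{Q}} - \mathbf{P}\cdot\dot{\mathbf{Q}} + \left[\frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial \mathbf{P}} - \mathbf{Q}\right]\cdot\dot{\mathbf{P}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) + \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial t}$$

Assume that the generating function $F_2$ determines the canonical variables $\mathbf{p}$ and $\mathbf{Q}$ to be

$$\mathbf{p} = \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial \mathbf{q}} \qquad \mathbf{Q} = \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial \mathbf{P}} \tag{14.81}$$

then the terms in brackets cancel, leading to the required transformation

$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial t} \tag{14.82}$$

Type 3: $F = F_3(\mathbf{p},\mathbf{Q},t) + \mathbf{q}\cdot\mathbf{p}$ :

The total time derivative of the generating function $F = F_3(\mathbf{p},\mathbf{Q},t) + \mathbf{q}\cdot\mathbf{p}$ is given by

$$\frac{dF}{dt} = \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial \mathbf{p}}\cdot\dot{\mathbf{p}} + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial \mathbf{Q}}\cdot\dot{\mathbf{Q}} + \dot{\mathbf{q}}\cdot\mathbf{p} + \mathbf{q}\cdot\dot{\mathbf{p}} + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial t} \tag{14.83}$$

Insert this into equation (14.76), and assume that the trivial scale factor $\lambda = 1$, then

$$-\left[\mathbf{q} + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial \mathbf{p}}\right]\cdot\dot{\mathbf{p}} - H(\mathbf{q},\mathbf{p},t) = \left[\mathbf{P} + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial \mathbf{Q}}\right]\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial t}$$

Assume that the generating function $F_3$ determines the canonical variables $\mathbf{q}$ and $\mathbf{P}$ to be

$$\mathbf{q} = -\frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial \mathbf{p}} \qquad \mathbf{P} = -\frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial \mathbf{Q}} \tag{14.84}$$

then the terms in brackets cancel, leading to the required transformation

$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial t} \tag{14.85}$$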


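Type 4: $F = F_4(\mathbf{p},\mathbf{P},t) + \mathbf{q}\cdot\mathbf{p} - \mathbf{Q}\cdot\mathbf{P}$ :

The total time derivative of the generating function $F = F_4(\mathbf{p},\mathbf{P},t) + \mathbf{q}\cdot\mathbf{p} - \mathbf{Q}\cdot\mathbf{P}$ is given by

$$\frac{dF}{dt} = \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial \mathbf{p}}\cdot\dot{\mathbf{p}} + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial \mathbf{P}}\cdot\dot{\mathbf{P}} + \dot{\mathbf{q}}\cdot\mathbf{p} + \mathbf{q}\cdot\dot{\mathbf{p}} - \dot{\mathbf{Q}}\cdot\mathbf{P} - \mathbf{Q}\cdot\dot{\mathbf{P}} + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial t} \tag{14.86}$$

Insert this into equation (14.76), and assume that the trivial scale factor $\lambda = 1$, then

$$-\left[\mathbf{q} + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial \mathbf{p}}\right]\cdot\dot{\mathbf{p}} - H(\mathbf{q},\mathbf{p},t) = \left[\frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial \mathbf{P}} - \mathbf{Q}\right]\cdot\dot{\mathbf{P}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial t}$$

Assume that the generating function $F_4$ determines the canonical variables $\mathbf{q}$ and $\mathbf{Q}$ to be

$$\mathbf{q} = -\frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial \mathbf{p}} \qquad \mathbf{Q} = \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial \mathbf{P}} \tag{14.87}$$

then the terms in brackets cancel, leading to the required transformation

$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial t} \tag{14.88}$$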


Note that the last three generating functions require the inclusion of additional bilinear products of $\mathbf{q}, \mathbf{p}, \mathbf{Q}, \mathbf{P}$ in order for the terms to cancel to give the required result. The addition of the bilinear terms ensures that the resultant generating function $F$ is the same using any of the four generating functions $F_1, F_2, F_3, F_4$. Frequently the $F_2(\mathbf{q},\mathbf{P},t)$ generating function is the most convenient. The four possible generating functions of the first kind, given above, are related by Legendre transformations. A canonical transformation does not have to conform to only one of the four generating functions $F_i$ for all the degrees of freedom; it can be a mixture of different flavors for the different degrees of freedom. The properties of the generating functions are summarized in table 14.1.


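Table 14.1  Canonical transformation generating functions

| Generating function | Generating function derivatives | Trivial special examples |
|---|---|---|
| $F = F_1(\mathbf{q},\mathbf{Q},t)$ | $p_i = \frac{\partial F_1}{\partial q_i}$, $P_i = -\frac{\partial F_1}{\partial Q_i}$ | $F_1 = q_i Q_i$: $Q_i = p_i$, $P_i = -q_i$ |
| $F = F_2(\mathbf{q},\mathbf{P},t) - \mathbf{Q}\cdot\mathbf{P}$ | $p_i = \frac{\partial F_2}{\partial q_i}$, $Q_i = \frac{\partial F_2}{\partial P_i}$ | $F_2 = q_i P_i$: $Q_i = q_i$, $P_i = p_i$ |
| $F = F_3(\mathbf{p},\mathbf{Q},t) + \mathbf{q}\cdot\mathbf{p}$ | $q_i = -\frac{\partial F_3}{\partial p_i}$, $P_i = -\frac{\partial F_3}{\partial Q_i}$ | $F_3 = p_i Q_i$: $Q_i = -q_i$, $P_i = -p_i$ |
| $F = F_4(\mathbf{p},\mathbf{P},t) + \mathbf{q}\cdot\mathbf{p} - \mathbf{Q}\cdot\mathbf{P}$ | $q_i = -\frac{\partial F_4}{\partial p_i}$, $Q_i = \frac{\partial F_4}{\partial P_i}$ | $F_4 = p_i P_i$: $Q_i = p_i$, $P_i = -q_i$ |

The partial derivatives of the generating functions $F_i$ determine the corresponding conjugate variables not explicitly included in the generating function $F_i$. Note that, for the first trivial example $F_1 = q_i Q_i$, the old momenta become the new coordinates, $Q_i = p_i$, and vice versa, $P_i = -q_i$. This illustrates that it is better to name them "conjugate variables" rather than "momenta" and "coordinates".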
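The derivative rules for the type 1 and type 2 generating functions can be applied mechanically. The sketch below, for one degree of freedom, solves $p = \partial F/\partial q$ for the new variable and then evaluates the remaining derivative; the helper names `transform_from_F1` and `transform_from_F2` are hypothetical and only illustrate the procedure.

```python
import sympy as sp

q, p, Q, P = sp.symbols("q p Q P", real=True)

def transform_from_F2(F2):
    """Return (Q, P) in terms of (q, p) using p = dF2/dq and Q = dF2/dP."""
    P_new = sp.solve(sp.Eq(p, sp.diff(F2, q)), P)[0]
    Q_new = sp.diff(F2, P).subs(P, P_new)
    return sp.simplify(Q_new), sp.simplify(P_new)

def transform_from_F1(F1):
    """Return (Q, P) in terms of (q, p) using p = dF1/dq and P = -dF1/dQ."""
    Q_new = sp.solve(sp.Eq(p, sp.diff(F1, q)), Q)[0]
    P_new = (-sp.diff(F1, Q)).subs(Q, Q_new)
    return sp.simplify(Q_new), sp.simplify(P_new)

def bracket_QP(Q_new, P_new):
    """Poisson bracket [Q, P] with respect to the old variables (q, p)."""
    return sp.simplify(sp.diff(Q_new, q) * sp.diff(P_new, p)
                       - sp.diff(Q_new, p) * sp.diff(P_new, q))

identity = transform_from_F2(q * P)    # trivial type 2 example of the table
exchange = transform_from_F1(q * Q)    # trivial type 1 example of the table
print(identity, exchange, bracket_QP(*identity), bracket_QP(*exchange))
```

Both trivial examples give $[Q, P] = 1$, which is the Poisson bracket test for a canonical transformation mentioned in chapter 14.3.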

In summary, Jacobi developed a mathematical framework for finding the generating function $F$ required to make a canonical transformation to a new Hamiltonian $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ that has a known solution. That is,

$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F}{\partial t} \tag{14.89}$$

When $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ is a constant, then a solution has been obtained. The inverse transformation for this solution, $\mathbf{Q}(t), \mathbf{P}(t) \rightarrow \mathbf{q}(t), \mathbf{p}(t)$, now can be used to express the final solution in terms of the original variables of the system.

Note the special case when $\mathcal{H}(\mathbf{Q},\mathbf{P},t) = 0$; then equation 14.89 reduces to the Hamilton-Jacobi relation (14.12)

$$H(\mathbf{q},\mathbf{p},t) + \frac{\partial S}{\partial t} = 0 \tag{14.12}$$

In this case, the generating function $F$ determines the action functional $S$ required to solve the Hamilton-Jacobi equation (14.12). Since equation (14.89) has transformed the Hamiltonian $H(\mathbf{q},\mathbf{p},t) \rightarrow \mathcal{H}(\mathbf{Q},\mathbf{P},t)$, for which $\mathcal{H}(\mathbf{Q},\mathbf{P},t) = 0$, then the solution $\mathbf{Q}(t), \mathbf{P}(t)$ for the Hamiltonian $\mathcal{H}(\mathbf{Q},\mathbf{P},t) = 0$ is obtained easily. This approach underlies Hamilton-Jacobi theory presented in chapter 14.4.


14.3.2  Applications  of  canonical  transformations 

The canonical transformation procedure may appear unnecessarily complicated for solving the examples given in this book, but it is essential for solving the complicated systems that occur in nature. For example, canonical transformations can be used to transform time-dependent (non-autonomous) Hamiltonians to time-independent (autonomous) Hamiltonians for which the solutions are known. Example 14.19 describes such a system. Canonical transformations provide a remarkably powerful approach for solving the equations of motion in Hamiltonian mechanics, especially when using the Hamilton-Jacobi approach discussed in chapter 14.4.

14.7  Example:  The identity canonical transformation

The identity transformation $F_2(\mathbf{q},\mathbf{P}) = \mathbf{q}\cdot\mathbf{P}$ satisfies (14.89) if the following relations are satisfied: $p_i = \frac{\partial F_2}{\partial q_i} = P_i$, $Q_i = \frac{\partial F_2}{\partial P_i} = q_i$, $\mathcal{H} = H$. Note that the new and old coordinates are identical, hence $F_2 = q_i P_i$ generates the identity transformation $q_i = Q_i$, $p_i = P_i$.

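14.8  Example:  The point canonical transformation

Consider the point transformation $F_2(\mathbf{q},\mathbf{P}) = \mathbf{f}(\mathbf{q},t)\cdot\mathbf{P}$ where $\mathbf{f}(\mathbf{q},t)$ is some function of $\mathbf{q}$. This transformation satisfies (14.89) if the following relations are satisfied: $Q_i = \frac{\partial F_2}{\partial P_i} = f_i(\mathbf{q},t)$, $p_i = \frac{\partial F_2}{\partial q_i} = \sum_j \frac{\partial f_j(\mathbf{q},t)}{\partial q_i} P_j$, and $\mathcal{H} = H + \frac{\partial \mathbf{f}}{\partial t}\cdot\mathbf{P}$, which reduces to $\mathcal{H} = H$ when $\mathbf{f}$ has no explicit time dependence. Point transformations correspond to point-to-point transformations of coordinates.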

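14.9  Example:  The exchange canonical transformation

The exchange transformation $F_1(\mathbf{q},\mathbf{Q}) = \mathbf{q}\cdot\mathbf{Q}$ satisfies (14.89) if the following relations are satisfied: $p_i = \frac{\partial F_1}{\partial q_i} = Q_i$, $P_i = -\frac{\partial F_1}{\partial Q_i} = -q_i$, $\mathcal{H} = H$. That is, the coordinates and momenta have been interchanged.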


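14.10  Example:  Infinitesimal point canonical transformation

Consider an infinitesimal point canonical transformation, that is, one that is infinitesimally close to a point identity. The generating function

$$F_2(\mathbf{q},\mathbf{P},t) = \mathbf{q}\cdot\mathbf{P} + \epsilon G(\mathbf{q},\mathbf{P},t)$$

satisfies (14.89) if the following relations are satisfied

$$Q_i = \frac{\partial F_2}{\partial P_i} = q_i + \epsilon\frac{\partial G(\mathbf{q},\mathbf{P},t)}{\partial P_i}$$

$$p_i = \frac{\partial F_2}{\partial q_i} = P_i + \epsilon\frac{\partial G(\mathbf{q},\mathbf{P},t)}{\partial q_i}$$

Thus the infinitesimal changes in $q_i$ and $p_i$ are given by

$$\delta q_i(\mathbf{q},\mathbf{p},t) = Q_i - q_i = \epsilon\frac{\partial G(\mathbf{q},\mathbf{P},t)}{\partial P_i} = \epsilon\frac{\partial G(\mathbf{q},\mathbf{p},t)}{\partial p_i} + O(\epsilon^2)$$

$$\delta p_i(\mathbf{q},\mathbf{p},t) = P_i - p_i = -\epsilon\frac{\partial G(\mathbf{q},\mathbf{P},t)}{\partial q_i} = -\epsilon\frac{\partial G(\mathbf{q},\mathbf{p},t)}{\partial q_i} + O(\epsilon^2)$$

Thus $G(\mathbf{q},\mathbf{P},t)$ is the generator of the infinitesimal canonical transformation.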


14.11  Example:  1-D harmonic oscillator via a canonical transformation

The classic one-dimensional harmonic oscillator provides an example of the use of canonical transformations. Consider the Hamiltonian where $\omega^2 = \frac{k}{m}$, then

$$H = \frac{p^2}{2m} + \frac{kq^2}{2} = \frac{1}{2m}\left(p^2 + m^2\omega^2 q^2\right)$$

This form of the Hamiltonian is a sum of two squares, suggesting a canonical transformation for which $\mathcal{H}$ is cyclic in a new coordinate. A guess for a canonical transformation is of the form $p = m\omega q\cot Q$, which is of the $F_1(q,Q)$ type where $F_1$ equals $F_1(q,Q) = \frac{m\omega q^2}{2}\cot Q$. Using (14.78) gives

$$p = \frac{\partial F_1(q,Q)}{\partial q} = m\omega q\cot Q$$

$$P = -\frac{\partial F_1(q,Q)}{\partial Q} = \frac{m\omega q^2}{2\sin^2 Q}$$

Solving for the coordinates $(p, q)$ yields

$$q = \sqrt{\frac{2P}{m\omega}}\sin Q \tag{a}$$

$$p = \sqrt{2m\omega P}\cos Q \tag{b}$$

Inserting these into $H$ gives

$$\mathcal{H} = \omega P\left(\cos^2 Q + \sin^2 Q\right) = \omega P$$


which  implies  that  Q is  a cyclic  coordinate. 
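The algebra of this example is easy to confirm symbolically. The following sketch substitutes relations (a) and (b) into the oscillator Hamiltonian and also evaluates the Poisson bracket $[q, p]$ with respect to the new variables $(Q, P)$, which should equal unity if the transformation is canonical. The symbols are assumed positive purely to simplify the square roots.

```python
import sympy as sp

Q, P = sp.symbols("Q P", positive=True)
m, omega = sp.symbols("m omega", positive=True)

# Relations (a) and (b): old variables expressed in terms of the new ones.
q = sp.sqrt(2 * P / (m * omega)) * sp.sin(Q)
p = sp.sqrt(2 * m * omega * P) * sp.cos(Q)

# Transformed Hamiltonian: H = p^2/2m + m omega^2 q^2/2 evaluated on (Q, P).
H_new = sp.simplify(p**2 / (2 * m) + m * omega**2 * q**2 / 2)

# Poisson bracket [q, p] taken with respect to (Q, P).
bracket_qp = sp.simplify(sp.diff(q, Q) * sp.diff(p, P)
                         - sp.diff(q, P) * sp.diff(p, Q))
print(H_new, bracket_qp)
```

The absence of $Q$ in the transformed Hamiltonian is what makes $P$ a constant of motion.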

The Hamiltonian is conservative, since it does not explicitly depend on time, and it equals the total energy since the transformation to generalized coordinates is time independent. Thus

$$\mathcal{H} = E = \omega P$$

Since

$$\dot{Q} = \frac{\partial \mathcal{H}}{\partial P} = \omega$$

then

$$Q = \omega t + \phi$$

Substituting $Q$ into (a) gives the well-known solution of the one-dimensional harmonic oscillator

$$q = \sqrt{\frac{2E}{m\omega^2}}\sin(\omega t + \phi)$$

14.4  Hamilton-Jacobi theory

Hamilton used the Principle of Least Action to derive the Hamilton-Jacobi relation (chapter 14.3)

$$H(\mathbf{q},\mathbf{p},t) + \frac{\partial S}{\partial t} = 0 \tag{14.12}$$

where $\mathbf{q}, \mathbf{p}$ refer to the $1 \leq i \leq n$ variables $q_i, p_i$ and $S(q_j(t_1), t_1, q_j(t_2), t_2)$ is the action functional. Integration of this first-order partial differential equation is nontrivial, which is a major handicap for practical exploitation of the Hamilton-Jacobi equation. This stimulated Jacobi to develop the mathematical framework for the canonical transformations that are required to solve the Hamilton-Jacobi equation. Jacobi's approach is to exploit generating functions for making a canonical transformation to a new Hamiltonian $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ that equals zero.

$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial S}{\partial t} = 0 \tag{14.90}$$

The generating function for solving the Hamilton-Jacobi equation then equals the action functional $S$.

The Hamilton-Jacobi theory is based on selecting a canonical transformation to new coordinates $(\mathbf{Q},\mathbf{P},t)$, all of which are either constant, or the $Q_i$ are cyclic, which implies that the corresponding momenta $P_i$ are constants. In either case, a solution to the equations of motion is obtained. A remarkable feature of Hamilton-Jacobi theory is that the canonical transformation is completely characterized by a single generating function, $S$. The canonical equations likewise are characterized by a single Hamiltonian function, $H$. Moreover, the generating function $S$ and Hamiltonian function $H$ are linked together by equation 14.12. The underlying goal of Hamilton-Jacobi theory is to transform the Hamiltonian to a known form such that the canonical equations become directly integrable. Since this transformation depends on a single scalar function, the problem is reduced to solving a single partial differential equation.


14.4.1  Time-dependent  Hamiltonian 

Jacobi's complete integral $S(q_i, P_i, t)$

The principle underlying Jacobi's approach to Hamilton-Jacobi theory is to provide a recipe for finding the generating function $F = S$ needed to transform the Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ to the new Hamiltonian $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ using equation 14.90. When the derivatives of the transformed Hamiltonian $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ are zero, then the equations of motion become

$$\dot{Q}_i = \frac{\partial \mathcal{H}}{\partial P_i} = 0 \tag{14.91}$$

$$\dot{P}_i = -\frac{\partial \mathcal{H}}{\partial Q_i} = 0 \tag{14.92}$$

and thus $Q_i$ and $P_i$ are constants of motion. The new Hamiltonian $\mathcal{H}$ must be related to the original Hamiltonian $H$ by a canonical transformation for which

$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial S}{\partial t} \tag{14.93}$$

Equations 14.91 and 14.92 are automatically satisfied if the new Hamiltonian $\mathcal{H} = 0$, since then equation 14.93 gives that the generating function $S$ satisfies equation 14.90.

Any of the four types of generating function can be used. Jacobi chose the type 2 generating function as being the most useful for most practical cases, that is, $S(q_i, P_i, t)$, which is called Jacobi's complete integral.

For generating functions $F_1$ and $F_2$ the generalized momenta are derived from the action by the derivative

$$p_i = \frac{\partial S}{\partial q_i}$$

Using this generalized momentum to replace $p_i$ in the Hamiltonian $H$, given in equation (14.93), leads to the Hamilton-Jacobi equation expressed in terms of the action $S$.

$$H\left(q_1, \ldots, q_n; \frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n}; t\right) + \frac{\partial S}{\partial t} = 0 \tag{14.94}$$

The Hamilton-Jacobi equation, (14.94), can be written more compactly using tensors $\mathbf{q}$ and $\nabla S$ to designate $(q_1, \ldots, q_n)$ and $\left(\frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n}\right)$ respectively. That is

$$H(\mathbf{q}, \nabla S, t) + \frac{\partial S}{\partial t} = 0 \tag{14.95}$$

Equation (14.95) is a first-order partial differential equation in $n + 1$ variables, which are the old spatial coordinates $q_i$ plus time $t$. The new momenta $P_i$ have not been specified except that they are constants since $\mathcal{H} = 0$.

Assume the existence of a solution of (14.95) of the form $S(q_i, P_i, t) = S(q_1, \ldots, q_n; \alpha_1, \ldots, \alpha_{n+1}; t)$ where the $\alpha_i$, which include the generalized momenta $P_i = \alpha_1, \alpha_2, \ldots, \alpha_n$, are the $n + 1$ independent constants of integration in the transformed frame. One constant of integration is irrelevant to the solution since only partial derivatives of $S(q_i, P_i, t)$ with respect to $q_i$ and $t$ are involved. Thus, if $S$ is a solution of the first-order partial differential equation, then so is $S + \alpha$ where $\alpha$ is a constant. Thus it can be assumed that one of the $n + 1$ constants of integration is just an additive constant which can be ignored, leading effectively to a solution

$$S(q_i, P_i, t) = S(q_1, \ldots, q_n; \alpha_1, \ldots, \alpha_n; t) \tag{14.96}$$

where none of the $n$ independent constants are solely additive. Such generating function solutions are called complete solutions of the first-order partial differential equations since all constants of integration are known.

It is possible to assume that the $n$ generalized momenta $P_i$ are constants $\alpha_i$. This allows the generalized momentum to be written as

$$p_i = \frac{\partial S(\mathbf{q}, \boldsymbol{\alpha}, t)}{\partial q_i} \tag{14.97}$$

Similarly, Hamilton's equations of motion give the conjugate coordinate $\mathbf{Q} = \boldsymbol{\beta}$, where $\beta_i$ are constants. That is

$$Q_i = \beta_i = \frac{\partial S(\mathbf{q}, \boldsymbol{\alpha}, t)}{\partial \alpha_i} \tag{14.98}$$

The above procedure has determined the complete set of $2n$ constants $(\mathbf{Q} = \boldsymbol{\beta}, \mathbf{P} = \boldsymbol{\alpha})$. It is possible to invert the canonical transformation to express the above solution, which is expressed in terms of $Q_i = \beta_i$ and $P_i = \alpha_i$, in terms of the original coordinates, that is, $q_j = q_j(\boldsymbol{\alpha}, \boldsymbol{\beta}, t)$ and momenta $p_j = p_j(\boldsymbol{\alpha}, \boldsymbol{\beta}, t)$, which is the required solution.
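The steps of Jacobi's procedure can be followed for the simplest possible case. The sketch below assumes a free particle in one dimension, $H = \frac{p^2}{2m}$, and a trial complete integral $S = \alpha q - \frac{\alpha^2 t}{2m}$; both are illustrative choices, not taken from the text. It checks that $S$ satisfies the Hamilton-Jacobi equation and then inverts $\beta = \partial S/\partial \alpha$ to recover $q(t)$.

```python
import sympy as sp

q, t, alpha, beta = sp.symbols("q t alpha beta", real=True)
m = sp.Symbol("m", positive=True)

# Trial complete integral for the free particle; alpha is the constant
# new momentum P (an assumed, illustrative choice of solution).
S = alpha * q - alpha**2 * t / (2 * m)

# Old momentum from p = dS/dq, then the Hamilton-Jacobi residual
# H(q, dS/dq) + dS/dt with H = p^2 / 2m.
p = sp.diff(S, q)
hj_residual = sp.simplify(p**2 / (2 * m) + sp.diff(S, t))

# The new coordinate is the constant beta = dS/dalpha; inverting gives q(t).
q_of_t = sp.solve(sp.Eq(beta, sp.diff(S, alpha)), q)[0]
print(hj_residual, p, q_of_t)
```

Here $\alpha$ is the conserved momentum and $\beta$ is the initial position, so the inverted transformation is the expected uniform motion.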


Hamilton's principal function $S_H(\mathbf{q}, t; \mathbf{q}_0, t_0)$

Hamilton's approach to solving the Hamilton-Jacobi equation (14.95) is to seek a canonical transformation from variables $(\mathbf{p}, \mathbf{q})$ at time $t$ to a new set of constant quantities, which may be the initial values $(\mathbf{q}_0, \mathbf{p}_0)$ at time $t_0$. Hamilton's principal function $S_H(\mathbf{q}, t; \mathbf{q}_0, t_0)$ is the generating function for this canonical transformation from the variables $(\mathbf{q}, \mathbf{p})$ at time $t$ to the initial variables $(\mathbf{q}_0, \mathbf{p}_0)$ at time $t_0$. Hamilton's principal function $S_H(\mathbf{q}, t; \mathbf{q}_0, t_0)$ is directly related to Jacobi's complete integral $S(q_i, P_i, t)$.

Note that $S_H$ is the generating function of a canonical transformation from the present time $(\mathbf{q}, \mathbf{p}, t)$ variables to the initial $(\mathbf{q}_0, \mathbf{p}_0, t_0)$, whereas Jacobi's $S$ is the generating function of a canonical transformation from the present $(\mathbf{q}, \mathbf{p}, t)$ variables to the constant variables $(\mathbf{Q} = \boldsymbol{\beta}, \mathbf{P} = \boldsymbol{\alpha})$. For the Hamilton approach, the canonical transformation can be accomplished in two steps using $S$, by first transforming from $(\mathbf{q}, \mathbf{p}, t)$ at time $t$ to $(\boldsymbol{\beta}, \boldsymbol{\alpha})$, then transforming from $(\boldsymbol{\beta}, \boldsymbol{\alpha})$ to $(\mathbf{q}_0, \mathbf{p}_0, t_0)$. That is, this two-step process corresponds to

$$S_H(\mathbf{q}, t; \mathbf{q}_0, t_0) = S(\mathbf{q}, \boldsymbol{\alpha}, t) - S(\mathbf{q}_0, \boldsymbol{\alpha}, t_0) \tag{14.99}$$

Hamilton's principal function $S_H(\mathbf{q}, t; \mathbf{q}_0, t_0)$ is related to Jacobi's complete integral $S(\mathbf{q}, \boldsymbol{\alpha}, t)$, and it will not be discussed further in this book.

14.4.2  Time-independent Hamiltonian


Frequently the Hamiltonian does not explicitly depend on time. For the standard Lagrangian with time-independent constraints and transformation, $H(\mathbf{q},\mathbf{p},t) = E$, which is the total energy. For this case, the Hamilton-Jacobi equation simplifies to give

$$\frac{\partial S}{\partial t} = -H(\mathbf{q},\mathbf{p},t) = -E(\boldsymbol{\alpha}) \tag{14.100}$$

The integration of the time dependence is trivial, and thus the action integral for a time-independent Hamiltonian equals

$$S(\mathbf{q}, \boldsymbol{\alpha}, t) = W(\mathbf{q}, \boldsymbol{\alpha}) - E(\boldsymbol{\alpha})\,t \tag{14.101}$$

That is, the action integral has separated into a time-independent term $W(\mathbf{q}, \boldsymbol{\alpha})$, which is called Hamilton's characteristic function, plus a time-dependent term $-E(\boldsymbol{\alpha})t$. Thus using equations 14.97 and 14.101 gives that the generalized momentum is

$$p_i = \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial q_i} \tag{14.102}$$

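The physical significance of Hamilton's characteristic function $W(\mathbf{q}, \boldsymbol{\alpha})$ can be understood by taking the total time derivative

$$\frac{dW}{dt} = \sum_i \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial q_i}\,\dot{q}_i = \sum_i p_i \dot{q}_i$$

Taking the time integral then gives

$$W(\mathbf{q}, \boldsymbol{\alpha}) = \int \sum_i p_i \dot{q}_i \, dt = \int \sum_i p_i \, dq_i \tag{14.103}$$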


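Note that this equals the abbreviated action described in chapter 13.2.3, that is $W(\mathbf{q}, \boldsymbol{\alpha}) = S_0(\mathbf{q}, \boldsymbol{\alpha})$.
Inserting the action $S(\mathbf{q}, \boldsymbol{\alpha}, t)$ into the Hamilton-Jacobi equation (14.12) gives

$$H\left(\mathbf{q}; \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial \mathbf{q}}\right) = E(\boldsymbol{\alpha}) \tag{14.104}$$

This is called the time-independent Hamilton-Jacobi equation. Usually it is convenient to have $E$ equal the total energy. However, sometimes it is more convenient to exclude the $k$th energy $E(\alpha_k)$ in the set, in which case $E = E(\alpha_1, \alpha_2, \ldots, \alpha_{k-1})$; the Routhian exploits this feature.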

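The equations of the canonical transformation expressed in terms of $W(\mathbf{q}, \boldsymbol{\alpha})$ are

$$p_i = \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial q_i} \qquad\qquad \beta_i + \frac{\partial E(\boldsymbol{\alpha})}{\partial \alpha_i}\, t = \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial \alpha_i} \tag{14.105}$$

These equations show that Hamilton's characteristic function $W(\mathbf{q}, \boldsymbol{\alpha})$ is itself the generating function of a time-independent canonical transformation from the old variables $(\mathbf{q}, \mathbf{p})$ to a set of new variables

$$Q_i = \beta_i + \frac{\partial E(\boldsymbol{\alpha})}{\partial \alpha_i}\, t \qquad\qquad P_i = \alpha_i \tag{14.106}$$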


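Table 14.2 summarizes the time-dependent and time-independent forms of the Hamilton-Jacobi equation.

Table 14.2: Hamilton-Jacobi formulations

| Hamiltonian | Time dependent $H(\mathbf{q},\mathbf{p},t)$ | Time independent $H(\mathbf{q},\mathbf{p})$ |
|---|---|---|
| Transformed Hamiltonian | $\mathcal{H} = 0$ | $\mathcal{H}$ is cyclic |
| Canonical transformed variables | All $Q_i, P_i$ are constants of motion | All $P_i$ are constants of motion |
| Transformed equations of motion | $\dot{Q}_i = \frac{\partial \mathcal{H}}{\partial P_i} = 0$, therefore $Q_i = \beta_i$; $\dot{P}_i = -\frac{\partial \mathcal{H}}{\partial Q_i} = 0$, therefore $P_i = \alpha_i$ | $\dot{Q}_i = \frac{\partial \mathcal{H}}{\partial P_i} = \nu_i$, therefore $Q_i = \nu_i t + \beta_i$; $\dot{P}_i = -\frac{\partial \mathcal{H}}{\partial Q_i} = 0$, therefore $P_i = \alpha_i$ |
| Generating function | Jacobi's complete integral $S(\mathbf{q}, \mathbf{P}, t)$ | Characteristic function $W(\mathbf{q}, \mathbf{P})$ |
| Hamilton-Jacobi equation | $H(q_1, \ldots, q_n; \frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n}; t) + \frac{\partial S}{\partial t} = 0$ | $H(q_1, \ldots, q_n; \frac{\partial W}{\partial q_1}, \ldots, \frac{\partial W}{\partial q_n}) = E$ |
| Transformation equations | $p_i = \frac{\partial S}{\partial q_i}$; $Q_i = \frac{\partial S}{\partial \alpha_i} = \beta_i$ | $p_i = \frac{\partial W}{\partial q_i}$; $Q_i = \frac{\partial W}{\partial \alpha_i} = \nu_i t + \beta_i$ |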
14.4.3 Separation of variables

Exploitation of the Hamilton-Jacobi theory requires finding a suitable action function $S$. When the Hamiltonian is time independent, then equation 14.101 shows that the time dependence of the action integral separates out from the dependence on the spatial variables. For many systems, Hamilton's characteristic function $W(\mathbf{q}, \boldsymbol{\alpha})$ separates into a simple sum of terms, each of which is a function of a single variable. That is,

$$W(\mathbf{q}, \boldsymbol{\alpha}) = W_1(q_1) + W_2(q_2) + \cdots + W_n(q_n) \tag{14.107}$$

where each function in the summation on the right depends only on a single variable. Then equation (14.100) reduces to

$$H\left(q_1, \ldots, q_n; \frac{\partial W}{\partial q_1}, \ldots, \frac{\partial W}{\partial q_n}\right) = E \tag{14.108}$$

where  E is  the  constant  denoting  the  total  energy. 

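Hamilton's characteristic function $W(\mathbf{q}, \boldsymbol{\alpha})$ can be used with equations (14.101), (14.102), (14.91), (14.92), and (14.93) to derive

$$p_i = \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial q_i} \qquad\qquad Q_i = \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial P_i} \tag{14.109}$$

$$\dot{Q}_i = \frac{\partial \mathcal{H}}{\partial P_i} \qquad\qquad \dot{P}_i = -\frac{\partial \mathcal{H}}{\partial Q_i} = 0 \tag{14.110}$$

$$\mathcal{H} = H + \frac{\partial S}{\partial t} = H - E = 0 \tag{14.111}$$

which has reduced the problem to a simple sum of one-dimensional first-order differential equations.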

If the $i$th variable is cyclic, then the Hamiltonian is not a function of $q_i$ and the $i$th term in Hamilton's characteristic function equals $W_i = \alpha_i q_i$, which separates out from the summation in equation 14.107. That is, all cyclic variables can be factored out of $W(\mathbf{q}, \boldsymbol{\alpha})$, which greatly simplifies solution of the Hamilton-Jacobi equation. As a consequence, the ability of the Hamilton-Jacobi method to make a canonical transformation to separate the system into many cyclic or independent variables, which can be solved trivially, is a remarkably powerful way for solving the equations of motion in Hamiltonian mechanics.


14.12  Example:  Free  particle 

Consider the motion of a free particle of mass $m$ in a force-free region. Then equation 14.93 reduces to

$$H\left(q_1, \ldots, q_n; \frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n}; t\right) + \frac{\partial S}{\partial t} = 0$$

Since no forces act, and the momentum $\mathbf{p} = \nabla S$, the Hamilton-Jacobi equation reduces to

$$\frac{1}{2m}\left(\nabla S\right)^2 + \frac{\partial S}{\partial t} = 0 \tag{A}$$

The Hamiltonian is time independent, thus equation 14.101 applies

$$S(\mathbf{q}, t) = W(\mathbf{q}, \boldsymbol{\alpha}) - E(\boldsymbol{\alpha})\, t$$

Since the Hamiltonian does not explicitly depend on the coordinates $(x, y, z)$, the coordinates are cyclic and separation of the variables, 14.107, gives that the action is

$$S = \boldsymbol{\alpha} \cdot \mathbf{r} - Et \tag{B}$$

For (B) to be a solution of (A) requires that

$$E = \frac{1}{2m}\alpha^2 \tag{C}$$

Therefore

$$S = \boldsymbol{\alpha} \cdot \mathbf{r} - \frac{1}{2m}\alpha^2 t \tag{D}$$

Since

$$\mathbf{Q} = \frac{\partial S}{\partial \boldsymbol{\alpha}} = \mathbf{r} - \frac{\boldsymbol{\alpha}}{m}\, t$$

the equation of motion and the conjugate momentum are given by

$$\mathbf{r} = \mathbf{Q} + \frac{\boldsymbol{\alpha}}{m}\, t$$

$$\mathbf{p} = \nabla S = \boldsymbol{\alpha}$$

Thus the Hamilton-Jacobi relation has given both the equation of motion and the linear momentum $\mathbf{p}$.
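The free-particle result can be checked numerically. The following is a minimal sketch, assuming illustrative values for the mass and the constant momentum; the helper names `action` and `hj_residual` are hypothetical. It evaluates the left-hand side of equation (A) for the action (D) by finite differences and confirms that $\mathbf{Q} = \mathbf{r} - (\boldsymbol{\alpha}/m)t$ stays constant along the trajectory.

```python
def action(r, t, alpha, m):
    """Free-particle action S = alpha . r - (alpha^2 / 2m) t, equation (D)."""
    dot = sum(a * x for a, x in zip(alpha, r))
    energy = sum(a * a for a in alpha) / (2.0 * m)
    return dot - energy * t


def hj_residual(r, t, alpha, m, h=1e-5):
    """Central-difference value of |grad S|^2 / 2m + dS/dt (should vanish)."""
    grad_sq = 0.0
    for k in range(3):
        r_plus, r_minus = list(r), list(r)
        r_plus[k] += h
        r_minus[k] -= h
        dS = (action(r_plus, t, alpha, m) - action(r_minus, t, alpha, m)) / (2.0 * h)
        grad_sq += dS * dS
    dS_dt = (action(r, t + h, alpha, m) - action(r, t - h, alpha, m)) / (2.0 * h)
    return grad_sq / (2.0 * m) + dS_dt


# Illustrative (assumed) values, not taken from the text
m = 2.0
alpha = (1.0, -0.5, 3.0)
residual = hj_residual((0.3, 1.2, -0.7), 0.8, alpha, m)

# Trajectory r(t) = Q0 + (alpha/m) t; recover Q = r - (alpha/m) t at a later time
Q0 = (0.1, 0.2, 0.3)
t_late = 5.0
r_late = tuple(q + a * t_late / m for q, a in zip(Q0, alpha))
Q_at_t = tuple(x - a * t_late / m for x, a in zip(r_late, alpha))

print("Hamilton-Jacobi residual:", residual)
print("Q recovered at t = 5:", Q_at_t)
```

Because the action is linear in $\mathbf{r}$ and $t$, the finite differences are exact up to rounding, so the residual is limited only by floating-point error.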


14.13  Example:  Point  particle  in  a uniform  gravitational  field 

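The Hamiltonian is

$$H = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) + mgz$$

Since the system is conservative, the Hamilton-Jacobi equation can be written in terms of Hamilton's characteristic function $W$

$$E = \frac{1}{2m}\left[\left(\frac{\partial W}{\partial x}\right)^2 + \left(\frac{\partial W}{\partial y}\right)^2 + \left(\frac{\partial W}{\partial z}\right)^2\right] + mgz$$

Assuming that the variables can be separated, $W = X(x) + Y(y) + Z(z)$ leads to

$$p_x = \frac{\partial X(x)}{\partial x} = \alpha_x$$

$$p_y = \frac{\partial Y(y)}{\partial y} = \alpha_y$$

$$p_z = \frac{\partial Z(z)}{\partial z} = \sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}$$

Thus by integration the total $W$ equals

$$W = \int_{x_0}^{x} \alpha_x \, dx + \int_{y_0}^{y} \alpha_y \, dy + \int_{z_0}^{z} \sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}\; dz$$

Therefore using (14.106) gives

$$\beta_z = t - t_0 = \int_{z_0}^{z} \frac{m\, dz}{\sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}}$$

$$\beta_x = \text{constant} = (x - x_0) - \int_{z_0}^{z} \frac{\alpha_x\, dz}{\sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}}$$

$$\beta_y = \text{constant} = (y - y_0) - \int_{z_0}^{z} \frac{\alpha_y\, dz}{\sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}}$$

If $x_0, y_0, z_0$ is the position of the particle at time $t = t_0$ then $\beta_x = \beta_y = 0$, and from (14.106)

$$x - x_0 = \frac{\alpha_x}{m}(t - t_0)$$

$$y - y_0 = \frac{\alpha_y}{m}(t - t_0)$$

$$z - z_0 = \frac{\sqrt{2m(E - mgz_0) - \alpha_x^2 - \alpha_y^2}}{m}\,(t - t_0) - \frac{1}{2} g (t - t_0)^2$$

This corresponds to a parabola as should be expected for this trivial example.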
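The integrals obtained from the separated characteristic function can be compared with the parabolic trajectory numerically. The sketch below is illustrative only: the parameter values and the helper `simpson` are assumptions, and the check is restricted to the rising part of the motion so that $p_z$ stays positive. It evaluates $t - t_0 = \int m\,dz/p_z$ and $x - x_0 = \int \alpha_x\,dz/p_z$ with Simpson's rule.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


# Illustrative (assumed) parameters
m, g = 1.5, 9.81
z0, vz0 = 0.0, 5.0              # launch height and initial upward speed
alpha_x, alpha_y = 2.0, -1.0    # constant horizontal momenta
E = (alpha_x**2 + alpha_y**2 + (m * vz0)**2) / (2 * m) + m * g * z0


def p_z(z):
    """Vertical momentum from the separated Hamilton-Jacobi equation."""
    return math.sqrt(2 * m * (E - m * g * z) - alpha_x**2 - alpha_y**2)


t = 0.3                                  # earlier than the apex time vz0 / g
z = z0 + vz0 * t - 0.5 * g * t**2        # height on the parabola at time t

elapsed = simpson(lambda s: m / p_z(s), z0, z)         # t - t0 from the beta_z relation
dx = simpson(lambda s: alpha_x / p_z(s), z0, z)        # x - x0 when beta_x = 0

print("elapsed time from integral:", elapsed, "expected:", t)
print("x - x0 from integral:", dx, "expected:", alpha_x * t / m)
```

Beyond the apex the integrand changes branch (the sign of $p_z$ flips), so a fuller treatment would split the integral at the turning point.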
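14.14 Example: One-dimensional harmonic oscillator

As discussed in example 14.11 the Hamiltonian for the one-dimensional harmonic oscillator can be written

$$H = \frac{1}{2m}\left(p^2 + m^2\omega^2 q^2\right) = E$$

assuming it is conservative and where $\omega = \sqrt{\frac{k}{m}}$.

Hamilton's characteristic function $W$ can be used where

$$S(q, E, t) = W(q, E) - Et$$

Inserting the generalized momentum $p = \frac{\partial W}{\partial q}$ into the Hamiltonian gives

$$\frac{1}{2m}\left[\left(\frac{\partial W}{\partial q}\right)^2 + m^2\omega^2 q^2\right] = E$$

Integration of this equation gives

$$W = \sqrt{2mE} \int dq\, \sqrt{1 - \frac{m\omega^2 q^2}{2E}}$$

That is

$$S = \sqrt{2mE} \int dq\, \sqrt{1 - \frac{m\omega^2 q^2}{2E}} - Et$$

Note that

$$\frac{\partial S(q, E, t)}{\partial E} = \frac{1}{2}\sqrt{\frac{2m}{E}} \int \frac{dq}{\sqrt{1 - \frac{m\omega^2 q^2}{2E}}} - t$$

Since $E$ is the constant new momentum, this derivative is a constant, which is written as $-t_0$. The integral can then be evaluated to give

$$t = \frac{1}{\omega} \arcsin\left(q\sqrt{\frac{m\omega^2}{2E}}\right) + t_0$$

That is

$$q = \sqrt{\frac{2E}{m\omega^2}}\, \sin \omega(t - t_0)$$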


This  is  the  familiar  solution  of  the  undamped  harmonic  oscillator. 
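The step from $W(q, E)$ to the time dependence can be illustrated numerically. The following sketch assumes illustrative values of $m$, $\omega$, and $E$, and uses hypothetical helpers `simpson` and `W`. It differentiates the numerically integrated characteristic function with respect to $E$ and compares the result with the closed-form $\frac{1}{\omega}\arcsin\left(q\sqrt{m\omega^2/2E}\right)$.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


# Illustrative (assumed) parameters
m, omega, E = 1.0, 2.0, 3.0


def W(q, energy):
    """Characteristic function W = integral from 0 to q of sqrt(2mE - m^2 w^2 s^2) ds."""
    return simpson(lambda s: math.sqrt(2 * m * energy - (m * omega * s) ** 2), 0.0, q)


q = 0.7       # inside the amplitude sqrt(2E / (m omega^2)), about 1.22
h = 1e-4
dW_dE = (W(q, E + h) - W(q, E - h)) / (2 * h)      # numerical dW/dE, equal to t - t0
closed_form = math.asin(q * math.sqrt(m * omega**2 / (2 * E))) / omega

# Inverting the closed form should return the starting coordinate
q_back = math.sqrt(2 * E / (m * omega**2)) * math.sin(omega * closed_form)

print("dW/dE (numerical):", dW_dE)
print("arcsin formula   :", closed_form)
print("q recovered      :", q_back)
```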

14.15  Example:  The  central  force  problem 

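The problem of a particle acted upon by a central force occurs frequently in physics. Consider the mass $m$ acted upon by a time-independent central potential energy $U(r)$. The Hamiltonian is time independent and can be written in spherical coordinates as

$$H = \frac{1}{2m}\left(p_r^2 + \frac{1}{r^2}p_\theta^2 + \frac{1}{r^2\sin^2\theta}p_\phi^2\right) + U(r) = E$$

The system is conservative, thus the time-independent Hamilton-Jacobi equation is

$$\frac{1}{2m}\left[\left(\frac{\partial W}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial W}{\partial \theta}\right)^2 + \frac{1}{r^2\sin^2\theta}\left(\frac{\partial W}{\partial \phi}\right)^2\right] + U(r) = E$$

Try a separable solution for Hamilton's characteristic function $W$ of the form

$$W = R(r) + \Theta(\theta) + \Phi(\phi)$$

The Hamilton-Jacobi equation then becomes

$$\frac{1}{2m}\left[\left(\frac{\partial R}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial \Theta}{\partial \theta}\right)^2 + \frac{1}{r^2\sin^2\theta}\left(\frac{\partial \Phi}{\partial \phi}\right)^2\right] + U(r) = E$$

This can be rearranged into the form

$$2mr^2\sin^2\theta\left\{\frac{1}{2m}\left[\left(\frac{\partial R}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial \Theta}{\partial \theta}\right)^2\right] + U(r) - E\right\} = -\left(\frac{\partial \Phi}{\partial \phi}\right)^2$$

The left-hand side is independent of $\phi$ whereas the right-hand side is independent of $r$ and $\theta$. Both sides must equal a constant which is set to equal $-L_z^2$, that is

$$\frac{1}{2m}\left[\left(\frac{\partial R}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial \Theta}{\partial \theta}\right)^2\right] + U(r) + \frac{L_z^2}{2mr^2\sin^2\theta} = E$$

$$\left(\frac{\partial \Phi}{\partial \phi}\right)^2 = L_z^2$$

The equation in $r$ and $\theta$ can be rearranged in the form

$$2mr^2\left[\frac{1}{2m}\left(\frac{\partial R}{\partial r}\right)^2 + U(r) - E\right] = -\left[\left(\frac{\partial \Theta}{\partial \theta}\right)^2 + \frac{L_z^2}{\sin^2\theta}\right]$$

The left-hand side is independent of $\theta$ and the right-hand side is independent of $r$, so both must equal a constant which is set to be $-L^2$

$$\frac{1}{2m}\left(\frac{\partial R}{\partial r}\right)^2 + U(r) + \frac{L^2}{2mr^2} = E$$

$$\left(\frac{\partial \Theta}{\partial \theta}\right)^2 + \frac{L_z^2}{\sin^2\theta} = L^2$$

The variables now are completely separated and, by rearrangement plus integration, one obtains

$$R(r) = \sqrt{2m}\int \sqrt{E - U(r) - \frac{L^2}{2mr^2}}\; dr$$

$$\Theta(\theta) = \int \sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}}\; d\theta$$

$$\Phi(\phi) = L_z \phi$$

Substituting these into $W = R(r) + \Theta(\theta) + \Phi(\phi)$ gives

$$W = \sqrt{2m}\int \sqrt{E - U(r) - \frac{L^2}{2mr^2}}\; dr + \int \sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}}\; d\theta + L_z\phi$$

Hamilton's characteristic function $W$ is the generating function from coordinates $(r, \theta, \phi, p_r, p_\theta, p_\phi)$ to new coordinates, which are cyclic, and new momenta that are constant and taken to be the separation constants $E, L, L_z$.

$$p_r = \frac{\partial W}{\partial r} = \sqrt{2m}\sqrt{E - U(r) - \frac{L^2}{2mr^2}}$$

$$p_\theta = \frac{\partial W}{\partial \theta} = \sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}}$$

$$p_\phi = \frac{\partial W}{\partial \phi} = L_z$$

Similarly, using (14.109) gives the new coordinates $\beta_E, \beta_L, \beta_{L_z}$

$$\beta_E + t = \frac{\partial W}{\partial E} = \sqrt{\frac{m}{2}}\int \frac{dr}{\sqrt{E - U(r) - \frac{L^2}{2mr^2}}}$$

$$\beta_L = \frac{\partial W}{\partial L} = \sqrt{2m}\int \frac{\left(-\frac{L}{2mr^2}\right) dr}{\sqrt{E - U(r) - \frac{L^2}{2mr^2}}} + \int \frac{L\, d\theta}{\sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}}}$$

$$\beta_{L_z} = \frac{\partial W}{\partial L_z} = -\int \frac{\left(\frac{L_z}{\sin^2\theta}\right) d\theta}{\sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}}} + \phi$$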


These  equations  lead  to  the  elliptical,  parabolic,  or  hyperbolic  orbits  discussed  in  chapter  9. 
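As a numerical illustration of the radial relation $\beta_E + t = \sqrt{m/2}\int dr/\sqrt{E - U(r) - L^2/2mr^2}$, the sketch below assumes an attractive inverse-square potential $U(r) = -k/r$ with illustrative values of $m$, $k$, $E$, and $L$ (none of which come from the text). Integrating between the radial turning points gives half the orbital period, which is compared with Kepler's third law $T = 2\pi\sqrt{ma^3/k}$ with $a = -k/2E$.

```python
import math

# Illustrative (assumed) parameters for a bound orbit in U(r) = -k / r
m, k = 1.0, 1.0
E, L = -0.5, 0.8


def U(r):
    return -k / r


def radicand(r):
    """E - U(r) - L^2 / (2 m r^2), which is p_r^2 / 2m."""
    return E - U(r) - L**2 / (2 * m * r**2)


# Turning points: roots of E r^2 + k r - L^2 / (2m) = 0
disc = math.sqrt(k**2 + 2 * E * L**2 / m)
r_min = (-k + disc) / (2 * E)
r_max = (-k - disc) / (2 * E)


def radial_half_period(n=4000):
    """sqrt(m/2) * integral of dr / sqrt(radicand) from r_min to r_max.

    The substitution r = c + d sin(u) removes the inverse-square-root
    singularities at the turning points, and the midpoint rule never
    samples the end points themselves.
    """
    c, d = 0.5 * (r_max + r_min), 0.5 * (r_max - r_min)
    h = math.pi / n
    total = 0.0
    for i in range(n):
        u = -0.5 * math.pi + (i + 0.5) * h
        r = c + d * math.sin(u)
        total += d * math.cos(u) / math.sqrt(radicand(r))
    return math.sqrt(m / 2.0) * total * h


period = 2.0 * radial_half_period()
a = -k / (2 * E)                                   # semi-major axis
kepler = 2 * math.pi * math.sqrt(m * a**3 / k)     # Kepler's third law

print("period from the Hamilton-Jacobi integral:", period)
print("period from Kepler's third law          :", kepler)
```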


14.16  Example:  Linearly- damped,  one- dimensional,  harmonic  oscillator 


A canonical treatment of the linearly-damped harmonic oscillator provides an example that combines use of a non-standard Lagrangian and Hamiltonian, a canonical transformation to an autonomous system, and use of Hamilton-Jacobi theory to solve this transformed system. It shows that Hamilton-Jacobi theory can be used to determine directly the solutions for the linearly-damped harmonic oscillator.

Non-standard  Hamiltonian: 

In chapter 3.5, the equation of motion for the linearly-damped, one-dimensional, harmonic oscillator was given to be

$$\frac{m}{2}\left[\ddot{q} + \Gamma\dot{q} + \omega_0^2 q\right] = 0 \tag{a}$$

Example 13.2 showed that three non-standard Lagrangians give equation of motion (a) when used with the standard Euler-Lagrange variational equations. One of these was the Bateman [Bat31] time-dependent Lagrangian

$$L_2(q, \dot{q}, t) = \frac{m}{2} e^{\Gamma t}\left[\dot{q}^2 - \omega_0^2 q^2\right] \tag{b}$$

This Lagrangian gave the generalized momentum to be

$$p = \frac{\partial L_2}{\partial \dot{q}} = m\dot{q}\, e^{\Gamma t} \tag{c}$$

which was used with equation 14.3 to derive the Hamiltonian

$$H_2(q, p, t) = p\dot{q} - L_2(q, \dot{q}, t) = \frac{p^2}{2m} e^{-\Gamma t} + \frac{1}{2} m\omega_0^2 q^2 e^{\Gamma t} \tag{d}$$

Note that both the Lagrangian and Hamiltonian are explicitly time dependent and thus they are not conserved quantities. This is as expected for this dissipative system.

Hamilton-Jacobi  theory: 

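The form of the non-autonomous Hamiltonian (d) suggests use of the generating function for a canonical transformation to an autonomous Hamiltonian, for which $\mathcal{H}$ is a constant of motion.

$$S(q, P, t) = F_2(q, P, t) = qP e^{\frac{\Gamma t}{2}} = QP$$

Then the canonical transformation gives

$$p = \frac{\partial S}{\partial q} = P e^{\frac{\Gamma t}{2}} \qquad\qquad Q = \frac{\partial S}{\partial P} = q\, e^{\frac{\Gamma t}{2}} \tag{e}$$

Inserting this canonical transformation into the above Hamiltonian leads to the transformed Hamiltonian, which is autonomous.

$$\mathcal{H}(Q, P, t) = H_2(q, p, t) + \frac{\partial F_2}{\partial t} = \frac{P^2}{2m} + \frac{\Gamma}{2} QP + \frac{m\omega_0^2}{2} Q^2 \tag{f}$$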
That is, the transformed Hamiltonian $\mathcal{H}(Q, P, t)$ is not explicitly time dependent, and thus is conserved. Expressed in the original canonical variables $(q, p)$, the transformed Hamiltonian

$$\mathcal{H}(Q, P, t) = \frac{p^2}{2m} e^{-\Gamma t} + \frac{\Gamma}{2} qp + \frac{m\omega_0^2}{2} q^2 e^{\Gamma t}$$

is a constant of motion which was not readily apparent when using the original Hamiltonian. This unexpected result illustrates the usefulness of canonical transformations for solving dissipative systems. The Hamilton-Jacobi theory now can be used to solve the equations of motion for the transformed variables $(Q, P)$ plus the transformed Hamiltonian $\mathcal{H}(Q, P, t)$. The derivative of the generating function is

$$\frac{\partial S}{\partial Q} = P \tag{g}$$

Using equation (g) to substitute for $P$ in the Hamiltonian $\mathcal{H}(Q, P, t)$ (equation (f)), the Hamilton-Jacobi method gives

$$\frac{1}{2m}\left(\frac{\partial S}{\partial Q}\right)^2 + \frac{\Gamma}{2} Q \frac{\partial S}{\partial Q} + \frac{m\omega_0^2}{2} Q^2 + \frac{\partial S}{\partial t} = 0$$

This equation is separable as described in 14.107 and thus let

$$S(Q, \alpha, t) = W(Q, \alpha) - \alpha t$$

where $\alpha$ is a separation constant. Then

$$\frac{1}{2m}\left(\frac{\partial W}{\partial Q}\right)^2 + \frac{\Gamma}{2} Q \frac{\partial W}{\partial Q} + \frac{m\omega_0^2}{2} Q^2 = \alpha \tag{h}$$

To simplify the equations define the variable $x$ as

$$x = \sqrt{m\omega_0}\, Q \tag{i}$$

then equation (h) can be written as

$$\left(\frac{\partial W}{\partial x}\right)^2 + Ax\frac{\partial W}{\partial x} + \left(x^2 - B\right) = 0 \tag{j}$$

where $A = \frac{\Gamma}{\omega_0}$ and $B = \frac{2\alpha}{\omega_0}$. Assume initial conditions $q(0) = q_0$ and $\dot{q}(0) = 0$.

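For this case the separation constant $\alpha > 0$, therefore $B > 0$. Note that equation (j) is a simple second-order algebraic relation, the solution of which is

$$\frac{\partial W}{\partial x} = -\frac{Ax}{2} \pm \sqrt{B - \left[1 - \left(\frac{A}{2}\right)^2\right] x^2} \tag{k}$$

The choice of the sign is irrelevant for this case and thus the positive sign is chosen. There are three possible cases for the solution depending on whether the square-root term is real, zero, or imaginary.

Case 1: $\frac{A}{2} < 1$, that is, $\frac{\Gamma}{2\omega_0} < 1$

Define $C = \sqrt{1 - \left(\frac{A}{2}\right)^2}$. Then equation (k) can be integrated to give

$$S = -\alpha t - \frac{Ax^2}{4} + \int \sqrt{B - C^2x^2}\; dx \tag{l}$$

and

$$\beta = \frac{\partial S}{\partial \alpha} = -t + \frac{1}{\omega_0}\int \frac{dx}{\sqrt{B - C^2x^2}}$$

This integral gives

$$\sin^{-1}\left(\frac{Cx}{\sqrt{B}}\right) = C\omega_0 (t + \beta) = \omega t + \delta$$

where

$$\omega = \omega_0 C = \omega_0\sqrt{1 - \left(\frac{\Gamma}{2\omega_0}\right)^2} \tag{m}$$

Transforming back to the original variable $q$ gives

$$q(t) = G e^{-\frac{\Gamma t}{2}} \sin(\omega t + \delta) \tag{n}$$

where $G$ and $\delta$ are given by the initial conditions. Equation (n) is identical to the solution for the underdamped linearly-damped linear oscillator given previously in equation 3.35.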

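Case 2: $\frac{A}{2} = 1$, that is, $\frac{\Gamma}{2\omega_0} = 1$

In this case $C = \sqrt{1 - \left(\frac{A}{2}\right)^2} = 0$ and thus equation (k) simplifies to

$$S = -\alpha t - \frac{Ax^2}{4} + x\sqrt{B}$$

and

$$\beta = \frac{\partial S}{\partial \alpha} = -t + \frac{x}{\omega_0\sqrt{B}}$$

Therefore the solution is

$$q(t) = e^{-\frac{\Gamma t}{2}}\left(F + Gt\right) \tag{o}$$

where $F$ and $G$ are constants given by the initial conditions. This is the solution for the critically-damped linearly-damped, linear oscillator given previously in equation 3.38.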

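Case 3: $\frac{A}{2} > 1$, that is, $\frac{\Gamma}{2\omega_0} > 1$

Define a real constant $D$ where $D = \sqrt{\left(\frac{A}{2}\right)^2 - 1} = iC$, then

$$S = -\alpha t - \frac{Ax^2}{4} + \int \sqrt{B + D^2x^2}\; dx$$

Then

$$\beta = \frac{\partial S}{\partial \alpha} = -t + \frac{1}{\omega_0}\int \frac{dx}{\sqrt{B + D^2x^2}}$$

This last integral gives

$$\sinh^{-1}\left(\frac{Dx}{\sqrt{B}}\right) = D\omega_0 (t + \beta) = \omega t + \delta$$

where

$$\omega = \omega_0 D = \omega_0\sqrt{\left(\frac{\Gamma}{2\omega_0}\right)^2 - 1}$$

Then the original variable gives

$$q(t) = G e^{-\frac{\Gamma t}{2}} \sinh(\omega t + \delta) \tag{p}$$

This is the classic solution of the overdamped linearly-damped, linear harmonic oscillator given previously in equation 3.37. The canonical transformation from a non-autonomous to an autonomous system allowed use of Hamiltonian mechanics to solve the damped oscillator problem.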

Note that this example used Bateman's non-standard Lagrangian, and corresponding Hamiltonian, for handling a dissipative linear oscillator system where the dissipation depends linearly on velocity. This non-standard Lagrangian led to the correct equations of motion and solutions when applied using either the time-dependent Lagrangian or the time-dependent Hamiltonian, and these solutions agree with those given in chapter 3.5, which were derived using Newtonian mechanics.
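The claim that the transformed Hamiltonian is a constant of motion while the Bateman Hamiltonian is not can be checked directly on the underdamped solution. The sketch below assumes illustrative values for $m$, $\omega_0$, $\Gamma$, $G$, and $\delta$; the function names are hypothetical. It evaluates both Hamiltonians at several times along $q(t) = Ge^{-\Gamma t/2}\sin(\omega t + \delta)$ with $p = m\dot{q}e^{\Gamma t}$.

```python
import math

# Illustrative (assumed) parameters with Gamma / (2 omega0) < 1 (underdamped)
m, omega0, Gamma = 1.3, 2.0, 0.6
G, delta = 0.9, 0.4
omega = omega0 * math.sqrt(1.0 - (Gamma / (2.0 * omega0)) ** 2)


def q(t):
    return G * math.exp(-0.5 * Gamma * t) * math.sin(omega * t + delta)


def q_dot(t):
    return G * math.exp(-0.5 * Gamma * t) * (
        omega * math.cos(omega * t + delta)
        - 0.5 * Gamma * math.sin(omega * t + delta)
    )


def momentum(t):
    """Bateman generalized momentum p = m q_dot exp(Gamma t)."""
    return m * q_dot(t) * math.exp(Gamma * t)


def bateman_hamiltonian(t):
    """H_2 = p^2 exp(-Gamma t) / 2m + m omega0^2 q^2 exp(Gamma t) / 2."""
    p = momentum(t)
    return (p**2 / (2 * m) * math.exp(-Gamma * t)
            + 0.5 * m * omega0**2 * q(t) ** 2 * math.exp(Gamma * t))


def transformed_hamiltonian(t):
    """Autonomous Hamiltonian written in the original variables (q, p)."""
    return bateman_hamiltonian(t) + 0.5 * Gamma * q(t) * momentum(t)


times = (0.0, 0.7, 1.9, 4.2)
values = [transformed_hamiltonian(t) for t in times]
expected = 0.5 * m * G**2 * omega**2

print("transformed Hamiltonian:", values)
print("m G^2 omega^2 / 2      :", expected)
print("Bateman Hamiltonian    :", [bateman_hamiltonian(t) for t in times])
```

Substituting the solution analytically gives the constant value $\frac{1}{2}mG^2\omega^2$, which is what the numerical values reproduce.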
14.4.4 Visual representation of the action function S.

The important role of the action integral $S$ can be illuminated by considering the case of a single point mass $m$ moving in a time-independent potential $U(r)$. Then the action reduces to

$$S(\mathbf{q}, \boldsymbol{\alpha}, t) = W(\mathbf{q}, \boldsymbol{\alpha}) - Et \tag{14.112}$$

Let $q_1 = x$, $q_2 = y$, $q_3 = z$, $p_1 = p_x$, $p_2 = p_y$, $p_3 = p_z$. The momentum components are given by

$$p_i = \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial q_i} \tag{14.113}$$

which corresponds to

$$\mathbf{p} = \nabla W = \nabla S \tag{14.114}$$

That is, the time-independent Hamilton-Jacobi equation is

$$\frac{1}{2m}\left|\nabla W\right|^2 + U(r) = E \tag{14.115}$$

This implies that the particle momentum is given by the gradient of Hamilton's characteristic function and is perpendicular to surfaces of constant $W$ as illustrated in figure 14.2. The constant $S$ surfaces are time dependent as given by equation (14.101). Thus, if at time $t = 0$ the equi-action surface $S_0(\mathbf{q}, t) = W_0(\mathbf{q}, \mathbf{P}) = 0$, then at $t = 1$ the same surface $S_0(\mathbf{q}, t) = 0$ now coincides with the $W = E$ surface, etc. That is, the equi-action surfaces move through space separately from the motion of the single point mass.

Figure 14.2: Surfaces of constant action integral S (dashed lines) and the corresponding particle momenta (solid lines) with arrows showing the direction.
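The separate motion of the equi-action surfaces and the particle can be made concrete for the simplest case. The sketch below assumes a free particle moving along $x$, for which $S = \alpha x - Et$ with $E = \alpha^2/2m$; the numerical values and the helper `front_position` are illustrative assumptions. The plane $S = \text{constant}$ advances at speed $E/\alpha$, which is half the particle speed $\alpha/m$.

```python
# Illustrative (assumed) values for a free particle moving along x
m = 2.0
alpha = 3.0                      # constant momentum
E = alpha**2 / (2 * m)           # energy from the Hamilton-Jacobi equation


def front_position(s_value, t):
    """x-coordinate of the plane S = alpha * x - E * t = s_value at time t."""
    return (s_value + E * t) / alpha


# Distance travelled by the S = 0 surface in unit time, i.e. E / alpha
front_speed = front_position(0.0, 1.0) - front_position(0.0, 0.0)
particle_speed = alpha / m

print("equi-action surface speed:", front_speed)
print("particle speed           :", particle_speed)
```

For this free-particle case the surface speed $E/|\mathbf{p}|$ plays the role of a phase velocity, while the particle moves with $\mathbf{p}/m$.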

The above pictorial representation is analogous to the situation for motion of a wavefront for electromagnetic waves in optics, or matter waves in quantum physics, where the wave equation separates into the form $\phi = \phi_0 e^{\frac{iS}{\hbar}} = \phi_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}$. Hamilton's goal was to create a unified theory for optics that was equally applicable to particle motion in classical mechanics. Thus the optical-mechanical analogy of the Hamilton-Jacobi theory has culminated in a universal theory that describes wave-particle duality; this was a Holy Grail of classical mechanics since Newton's time. It played an important role in development of the Schrödinger representation of quantum mechanics.


14.4.5  Advantages  of  Hamilton- Jacobi  theory 

Initially, only a few scientists, like Jacobi, recognized the advantages of Hamiltonian mechanics. In 1843 Jacobi made some brilliant mathematical developments in Hamilton-Jacobi theory, greatly enhancing exploitation of Hamiltonian mechanics. Hamilton-Jacobi theory now serves as a foundation for contemporary physics, such as quantum and statistical mechanics. A major advantage of Hamilton-Jacobi theory, compared to other formulations of analytic mechanics, is that it provides a single, first-order partial differential equation for the action $S$, which is a function of the $n$ generalized coordinates $\mathbf{q}$ and time $t$. The generalized momenta no longer appear explicitly in the Hamiltonian in equations 14.94 and 14.95. Note that the generalized momenta do not explicitly appear in the equivalent Euler-Lagrange equations of Lagrangian mechanics, but these comprise a system of $n$ second-order differential equations for the time evolution of the generalized coordinates $\mathbf{q}$. Hamilton's equations of motion are a system of $2n$ first-order equations for the time evolution of the generalized coordinates and their conjugate momenta.

An important advantage of the Hamilton-Jacobi theory is that it provides a formulation of classical mechanics in which motion of a particle can be represented by a wave. In this sense, the Hamilton-Jacobi equation fulfilled a long-held goal of theoretical physics, dating back to Johann Bernoulli, of finding an analogy between the propagation of light and the motion of a particle. This goal motivated Hamilton to develop Hamiltonian mechanics. A consequence of this wave-particle analogy is that the Hamilton-Jacobi formalism featured prominently in the derivation of the Schrödinger equation during the development of quantum-wave mechanics.
14.5 Action-angle variables

14.5.1  Canonical  transformation 


Systems possessing periodic solutions are a ubiquitous feature in physics. The periodic motion can be either an oscillation, for which the trajectory in phase space is a closed loop (libration), or rolling (rotational) motion as discussed in chapter 3.4.4. For many problems involving periodic motion, the interest often lies in the frequencies of motion rather than the detailed shape of the trajectories in phase space. The action-angle variable approach uses a canonical transformation to action and angle variables, which provides a powerful and elegant method to exploit Hamiltonian mechanics. In particular, it can determine the frequencies of periodic motion without having to calculate the exact trajectories for the motion. This method was introduced by the French astronomer Ch. E. Delaunay (1816-1872) for applications to orbits in celestial mechanics, but it has equally important applications beyond celestial mechanics, such as to bound solutions of the atom in quantum mechanics.

The action-angle method replaces the momenta in the Hamilton-Jacobi procedure by the action phase integral for the closed loop (libration) trajectory in phase space defined by

$$J_i \equiv \oint p_i \, dq_i \tag{14.116}$$

where for each cyclic variable the integral is taken over one complete period of oscillation. The cyclic variable is called the action variable where

$$I_i \equiv \frac{1}{2\pi} J_i = \frac{1}{2\pi}\oint p_i \, dq_i \tag{14.117}$$

The canonical variable conjugate to the action variable $I$ is the angle variable $\phi$. Note that the name "action variable" is used to differentiate $I$ from the action functional $S = \int L\, dt$, which has the same units, i.e. angular momentum.

The general principle underlying the use of action-angle variables is illustrated by considering one body, of mass $m$, subject to a one-dimensional bound conservative potential energy $U(q)$. The Hamiltonian is given by

$$H(p, q) = \frac{p^2}{2m} + U(q) \tag{14.118}$$

This bound system has a $(q, p)$ phase space contour for each energy $H = E$.

$$p(q, E) = \pm\sqrt{2m\left(E - U(q)\right)} \tag{14.119}$$

For an oscillatory system the two-valued momentum of equation 14.119 is non-trivial to handle. By contrast, the area $J \equiv \oint p\, dq$ of the closed loop in phase space is a single-valued scalar quantity that depends on $E$ and $U(q)$. Moreover, Liouville's theorem states that the area of the closed contour in phase space $J = \oint p\, dq$ is invariant to canonical transformations. These facts suggest the use of a new pair of conjugate variables, $(\phi, I)$, where $I(E)$ uniquely labels the trajectory, and corresponding area, of a closed loop in phase space for each value of $E$, and the single-valued function $\phi$ is a corresponding angle that specifies the exact point along the phase-space contour as illustrated in Fig 14.3.

For simplicity consider the linear harmonic oscillator where

$$U(q) = \frac{1}{2} m\omega^2 q^2 \tag{14.120}$$

Then the Hamiltonian, 14.118, equals

$$H(p, q) = \frac{p^2}{2m} + \frac{1}{2} m\omega^2 q^2 \tag{14.121}$$

Hamilton's equations of motion give that

$$\dot{p} = -\frac{\partial H}{\partial q} = -m\omega^2 q \tag{14.122}$$

$$\dot{q} = \frac{\partial H}{\partial p} = \frac{p}{m} \tag{14.123}$$
The solution of equations 14.122 and 14.123 is of the form

$$q = C\cos\omega(t - t_0) \tag{14.124}$$

$$p = -m\omega C \sin\omega(t - t_0) \tag{14.125}$$

where $C$ and $t_0$ are integration constants. For the harmonic oscillator, equations 14.124 and 14.125 correspond to the usual elliptical contours in phase space, as illustrated in figure 14.3.

The action-angle canonical transformation involves making the transform

$$(q, p) \rightarrow (\phi, I) \tag{14.126}$$

where $I$ is defined by equation 14.117 and the angle $\phi$ is the corresponding canonical angle. The logical approach to this canonical transformation for the harmonic oscillator is to define $q$ and $p$ in terms of $\phi$ and $I$

$$q = \sqrt{\frac{2I}{m\omega}}\cos\phi \tag{14.127}$$

$$p = -\sqrt{2mI\omega}\,\sin\phi \tag{14.128}$$

where the sign of $p$ is chosen to agree with equation 14.125. Note that the Poisson bracket is unity

$$[q, p]_{(\phi, I)} = 1$$

which implies that the above transformation is canonical, and thus the phase space area $I(E) \equiv \frac{1}{2\pi}\oint p\, dq$ is conserved.

For this canonical transformation the transformed Hamiltonian $\mathcal{H}(\phi, I)$ is

$$\mathcal{H}(\phi, I) = \frac{1}{2m}\left(2m\omega I\right)\sin^2\phi + \frac{1}{2} m\omega^2 \frac{2I}{m\omega}\cos^2\phi = \omega I \tag{14.129}$$

Note that this Hamiltonian is a constant that is independent of the angle $\phi$, and thus Hamilton's equations of motion give

$$\dot{I} = -\frac{\partial \mathcal{H}(\phi, I)}{\partial \phi} = 0 \tag{14.130}$$

$$\dot{\phi} = \frac{\partial \mathcal{H}(\phi, I)}{\partial I} = \omega \tag{14.131}$$

Thus we have mapped the harmonic oscillator to new coordinates $(\phi, I)$ where

$$I = \frac{\mathcal{H}(\phi, I)}{\omega} = \frac{E}{\omega} \tag{14.132}$$

$$\phi = \omega(t - t_0) \tag{14.133}$$

Figure 14.3: The potential energy $U(q)$ (upper) and corresponding phase space $(p, q)$ (middle) for the harmonic oscillator at four equally spaced total energies $E$. The corresponding action-angles $(I, \phi)$ resulting from a canonical transformation of this system are shown in the lower plot.


That is, the phase space has been mapped from ellipses, with area proportional to $E$ in the $(q, p)$ phase space, to a cylindrical $(\phi, I)$ phase space where $I = \frac{E}{\omega}$ are constant values that are independent of the angle, while $\phi$ increases linearly with time. Thus the variables $(q, p)$ are periodic with modulus $\Delta\phi = 2\pi$.

$$q(\phi + 2\pi, I) = q(\phi, I) \tag{14.134}$$

$$p(\phi + 2\pi, I) = p(\phi, I) \tag{14.135}$$

The period $\tau$ of the periodic oscillatory motion is given simply by $\Delta\phi = 2\pi = \omega\tau$, which is the well-known result for the harmonic oscillator. Note that the action-angle variable canonical transformation has determined the frequency of the periodic motion without solving the detailed trajectory of the motion.
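The harmonic-oscillator results $I = E/\omega$ and $\dot{\phi} = \partial \mathcal{H}/\partial I = \omega$ can be reproduced by evaluating the phase-space loop integral numerically. The following is a minimal sketch with assumed parameter values and a hypothetical helper `action_variable`; it uses the substitution $q = A\sin u$ so that the turning points cause no difficulty.

```python
import math

# Illustrative (assumed) oscillator parameters
m, omega = 1.0, 3.0


def U(q):
    return 0.5 * m * omega**2 * q**2


def action_variable(E, n=2000):
    """I(E) = (1 / 2 pi) * closed-loop integral of p dq at energy E."""
    amp = math.sqrt(2.0 * E / (m * omega**2))      # turning points at q = +/- amp
    h = math.pi / n
    half_loop = 0.0
    for i in range(n):
        u = -0.5 * math.pi + (i + 0.5) * h         # midpoint rule in u, q = amp sin(u)
        q = amp * math.sin(u)
        p = math.sqrt(max(2.0 * m * (E - U(q)), 0.0))
        half_loop += p * amp * math.cos(u) * h
    # The loop consists of the p > 0 branch and the p < 0 return branch
    return 2.0 * half_loop / (2.0 * math.pi)


E = 4.5
I = action_variable(E)

# Angular frequency from the slope dE/dI, estimated by a central difference
dE = 1e-4
frequency = 2 * dE / (action_variable(E + dE) - action_variable(E - dE))

print("I(E):", I, " E / omega:", E / omega)
print("dE/dI:", frequency, " omega:", omega)
```

No trajectory is integrated in time here; the frequency follows from how the enclosed phase-space area changes with energy.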
The above example of the harmonic oscillator has shown that, for integrable periodic systems, it is possible to identify a canonical transformation to $(\phi, I)$ such that the Hamiltonian is independent of the angle $\phi$ which specifies the instantaneous location on the constant energy contour $I$. If the phase space contour is a separatrix, then it divides phase space into invariant regions containing phase-space contours with differing behavior. The action-angle variables are not useful for separatrix contours. For rolling motion, the system rotates with continuously increasing, or decreasing, angle, and there is no natural boundary for the action-angle variable since the phase space trajectory is continuous and not closed. However, the action-angle approach still is valid if the motion involves periodic as well as rolling motion.

The  example  of  the  one-dimensional,  one-body,  harmonic  oscillator  can  be  expanded  to  the  more  general 
case  for  many  bodies  in  three  dimensions.  This  is  illustrated  by  considering  multiple  periodic  systems  for 
which  the  Hamiltonian  is  conservative  and  where  the  equations  of  the  canonical  transformation  are  separable. 
The generalized momenta then can be written as

$$p_i = \frac{\partial W_i(q_i; \alpha_1, \alpha_2, \ldots, \alpha_n)}{\partial q_i} \tag{14.136}$$

for which each $p_i$ is a function of $q_i$ and the $n$ integration constants $\alpha_j$

$$p_i = p_i(q_i, \alpha_1, \alpha_2, \ldots, \alpha_n) \tag{14.137}$$

The momentum $p_i(q_i, \alpha_1, \alpha_2, \ldots, \alpha_n)$ represents the trajectory of the system in the $(q_i, p_i)$ phase space that is characterized by Hamilton's characteristic function $W(\mathbf{q}, \mathbf{J})$. Combining equations 14.116 and 14.136 gives

$$J_i = \oint \frac{\partial W_i(q_i; \alpha_1, \alpha_2, \ldots, \alpha_n)}{\partial q_i}\, dq_i \tag{14.138}$$

Since $q_i$ is merely a variable of integration, each action variable $J_i$ is a function of the $n$ constants of integration in the Hamilton-Jacobi equation. Because of the independence of the separable-variable pairs $(q_i, p_i)$, the $J_i$ form $n$ independent functions of the $\alpha_i$, and hence are suitable for use as a new set of constant momenta. Thus the characteristic function $W$ can be written as

$$W(q_1, \ldots, q_n; J_1, \ldots, J_n) = \sum_j W_j(q_j; J_1, \ldots, J_n) \tag{14.139}$$

while the Hamiltonian is only a function of the momenta, $H(J_1, \ldots, J_n)$.

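Since $J_i = 2\pi I_i$, the characteristic function and the Hamiltonian can equally be regarded as functions of the action variables $I_i$. The generalized coordinate conjugate to $I_i$ is known as the angle variable $\phi_i$, which is defined by the transformation equation

$$\phi_i = \frac{\partial W}{\partial I_i} = \sum_{j=1}^{n} \frac{\partial W_j(q_j; I_1, \ldots, I_n)}{\partial I_i} \tag{14.140}$$

The corresponding equation of motion for $\phi_i$ is given by

$$\dot{\phi}_i = \frac{\partial H(\mathbf{I})}{\partial I_i} = 2\pi\nu_i(I_1, \ldots, I_n) \tag{14.141}$$

where $\nu_i(\mathbf{I})$ are constant functions of the action variables $I_i$, with a solution

$$\phi_i = 2\pi\nu_i t + \beta_i \tag{14.142}$$

that is, they are linear functions of time. The constants $\nu_i$ can be identified with the frequencies of the multiple periodic motions.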

The action-angle variables appear to be no different than a particular set of transformed coordinates. Their merit appears when the physical interpretation is assigned to $\nu_i$. Consider the change $\delta\phi_i$ as the $q_j$ are changed infinitesimally

$$\delta\phi_i = \sum_j \frac{\partial \phi_i}{\partial q_j}\, dq_j = \sum_j \frac{\partial^2 W}{\partial q_j\, \partial I_i}\, dq_j \tag{14.143}$$

The derivative with respect to $q_j$ vanishes except for the $W_j$ component of $W$. Thus equation 14.143 reduces to

$$\delta\phi_i = \frac{\partial}{\partial I_i} \sum_j p_j(q_j, \mathbf{I})\, dq_j \tag{14.144}$$

Therefore, the total change in $\phi_i$ as the system goes through one complete cycle in $q_j$ is

$$\Delta\phi_i = \frac{\partial}{\partial I_i} \oint p_j(q_j, \mathbf{I})\, dq_j = 2\pi\delta_{ij} \tag{14.145}$$

where $\frac{\partial}{\partial I_i}$ is outside the integral since the $I_i$ are constants for cyclic motion. Thus $\Delta\phi_i = 2\pi = \omega_i\tau_i$, where $\tau_i$ is the period for one cycle of oscillation and the angular frequency $\omega_i$ is given by

$$\omega_i = \frac{2\pi}{\tau_i} \tag{14.146}$$

Thus the frequency $\nu$ associated with the periodic motion is the reciprocal of the period $\tau$. The secret here is that the derivative of $H$ with respect to the action variable, given by equation (14.141), directly determines the frequency of the periodic motion without the need to solve the complete equations of motion. Note that multiple periodic motion can be represented by a Fourier expansion of the form

$$q_k = \sum_{j_1=-\infty}^{\infty} \sum_{j_2=-\infty}^{\infty} \cdots \sum_{j_n=-\infty}^{\infty} a^{k}_{j_1 \ldots j_n}\, e^{2\pi i\left(j_1\nu_1 + j_2\nu_2 + j_3\nu_3 + \cdots + j_n\nu_n\right)t} \tag{14.147}$$
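The statement that the derivative of the energy with respect to the action gives the frequency also applies to anharmonic motion, where the frequency depends on the energy. The sketch below is an illustration under assumed conditions: a quartic well $U(q) = kq^4$ with arbitrary values of $m$, $k$, and $E$, and hypothetical helper names. It compares the frequency $\nu = dE/dJ$, obtained from the loop integral $J = \oint p\, dq$, with the reciprocal of the period computed directly from $\oint dq/\dot{q}$.

```python
import math

# Illustrative (assumed) anharmonic well U(q) = k q^4
m, k = 1.0, 1.0


def U(q):
    return k * q**4


def turning_point(E):
    return (E / k) ** 0.25


def loop_integral(E, integrand, n=4000):
    """Integrate integrand(q, p) dq once around the closed orbit at energy E.

    Uses q = A sin(u) with the midpoint rule on the p > 0 branch, then
    doubles the result to account for the return branch.
    """
    A = turning_point(E)
    h = math.pi / n
    total = 0.0
    for i in range(n):
        u = -0.5 * math.pi + (i + 0.5) * h
        q = A * math.sin(u)
        p = math.sqrt(2.0 * m * (E - U(q)))
        total += integrand(q, p) * A * math.cos(u) * h
    return 2.0 * total


def J(E):
    """Phase-space area enclosed by the orbit of energy E."""
    return loop_integral(E, lambda q, p: p)


E = 2.0
dE = 1e-4
freq_from_action = 2 * dE / (J(E + dE) - J(E - dE))       # nu = dE / dJ
period_direct = loop_integral(E, lambda q, p: m / p)      # loop integral of dq / velocity

print("frequency from dE/dJ      :", freq_from_action)
print("1 / period (direct)       :", 1.0 / period_direct)
```

The two numbers agree, and repeating the calculation at a different energy gives a different frequency, unlike the harmonic oscillator.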


Although the action-angle approach to Hamilton-Jacobi theory does not produce complete equations of motion, it does provide the frequency decomposition that often is the physics of interest. The reason that the powerful action-angle variable approach has been introduced here is that it is used extensively in celestial mechanics. The action-angle concept also played a key role in the development of quantum mechanics, in that Sommerfeld recognized that Bohr's ad hoc assumption that angular momentum is quantized could be expressed in terms of quantization of the action variable, as is mentioned in chapter 17.


14.5.2  Adiabatic  invariance  of  the  action  variables 

When  the  Hamiltonian  depends  on  time  it  can  be  quite  difficult  to  solve  for  the  motion  because  it  is  hard 
to  find  constants  of  motion  for  time-dependent  systems.  However,  if  the  time  dependence  is  sufficiently 
slow,  that  is,  if  the  motion  is  adiabatic,  then  there  exist  dynamical  variables  that  are  almost  constant  which 
can  be  used  to  solve  for  the  motion.  In  particular,  such  approximate  constants  are  the  familiar  action-angle 
integrals.  The  adiabatic  invariance  of  the  action  variables  played  an  important  role  in  the  development  of 
quantum  mechanics  at  the  1911  Solvay  Conference.  This  was  a time  when  physicists  were  grappling  with 
the  concepts  of  quantum  mechanics.  Einstein  used  the  following  classical  mechanics  example  of  adiabatic 
invariance,  applied  to  the  simple  pendulum,  in  order  to  illustrate  the  concept  of  adiabatic  invariance  of  the 
action.  This  example  demonstrates  the  power  of  using  action-angle  variables. 

14.17  Example:  Adiabatic  invariance  for  the  simple  pendulum 

Consider that the pendulum is made up of a point mass $M$ suspended from a pivot by a light string of length $L$ that is swinging freely in a vertical plane. Derive the dependence of the amplitude of the oscillations $\theta$, assuming $\theta$ is small, if the string is very slowly shortened by a factor of 2, that is, assume that the change in length during one period of the oscillation is very small.

The tension in the string $T$ is given by

$$T = Mg\cos\theta + \frac{ML^2\dot\theta^2}{L}$$

Let the pendulum angle be oscillatory

$$\theta = \theta_0 \cos(\omega t + \phi_0)$$

Then the mean square amplitude and velocity averaged over one period are

$$\langle\theta^2\rangle = \langle[\theta_0\cos(\omega t + \phi_0)]^2\rangle = \frac{\theta_0^2}{2}$$

$$\langle\dot\theta^2\rangle = \langle[-\theta_0\omega\sin(\omega t + \phi_0)]^2\rangle = \frac{\theta_0^2\omega^2}{2}$$

Since, for the simple pendulum, $\omega^2 = g/L$, then the average tension in the string is

$$T = Mg\left(1 - \frac{\theta_0^2}{4}\right) + ML\left(\frac{\theta_0^2\omega^2}{2}\right) = Mg\left(1 + \frac{\theta_0^2}{4}\right)$$


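Assuming that $\theta_0$ is a small angle, and that the change in length $\Delta L$ (negative when the string is shortened) is very small during one period $\tau$, then the work done on the pendulum is

$$\Delta W = -T\Delta L = -Mg\Delta L - Mg\frac{\theta_0^2}{4}\Delta L \tag{a}$$

while the change in internal oscillator energy is

$$\Delta(-MgL\cos\theta_0) = \Delta\left[-MgL\left(1 - \frac{\theta_0^2}{2}\right)\right] = -Mg\Delta L + \frac{1}{2}Mg\Delta(L\theta_0^2) = -Mg\Delta L + \frac{1}{2}Mg\theta_0^2\Delta L + MgL\theta_0\Delta\theta_0$$

The work done must balance the increment in internal energy, therefore

$$L\theta_0\Delta\theta_0 + \frac{3\theta_0^2}{4}\Delta L = 0 \tag{b}$$

or

$$L\theta_0^2\, \Delta\ln\left(\theta_0 L^{3/4}\right) = 0$$

Therefore it follows that

$$\theta_0 L^{3/4} = \text{constant} \tag{c}$$

or

$$\theta_0 \propto L^{-3/4}$$

Thus shortening the length of the pendulum string from $L$ to $L/2$ adiabatically corresponds to the amplitude increasing by a factor $2^{3/4} \approx 1.68$.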

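Consider the action-angle integral for one closed period $\tau = 2\pi/\omega$ for this problem

$$J = \oint p_\theta\, d\theta = \oint ML^2\dot\theta \cdot \dot\theta\, dt = ML^2\langle\dot\theta^2\rangle\frac{2\pi}{\omega} = \pi ML^2\theta_0^2\omega = \pi M g^{1/2}\theta_0^2 L^{3/2} = \text{constant}$$

where that last step is due to equation (c).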
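The scaling $\theta_0 \propto L^{-3/4}$ can be checked by direct numerical integration. The sketch below is an illustration under stated assumptions, not part of the original example: it assumes the small-angle equation of motion $\ddot\theta = -(g/L)\theta - 2(\dot L/L)\dot\theta$, which follows from $\frac{d}{dt}(ML^2\dot\theta) = -MgL\theta$, a length that decreases linearly in time over many periods, and a fourth-order Runge-Kutta integrator. The function name and all parameter values are hypothetical.

```python
import math

def simulate_shortening_pendulum(L0=1.0, L1=0.5, g=9.81,
                                 total_time=400.0, dt=0.005, theta0=0.05):
    """Return (initial amplitude, final amplitude) for a slowly shortened pendulum."""
    L_dot = (L1 - L0) / total_time              # constant, slow rate of change of length

    def length(t):
        return L0 + L_dot * t

    def deriv(t, theta, theta_dot):
        L = length(t)
        return theta_dot, -(g / L) * theta - 2.0 * (L_dot / L) * theta_dot

    def amplitude(t, theta, theta_dot):
        # instantaneous amplitude of a harmonic oscillator with omega^2 = g/L
        omega_sq = g / length(t)
        return math.sqrt(theta ** 2 + theta_dot ** 2 / omega_sq)

    t, theta, theta_dot = 0.0, theta0, 0.0
    amp_start = amplitude(t, theta, theta_dot)
    for _ in range(int(round(total_time / dt))):
        # classical fourth-order Runge-Kutta step
        k1a, k1b = deriv(t, theta, theta_dot)
        k2a, k2b = deriv(t + dt / 2, theta + dt / 2 * k1a, theta_dot + dt / 2 * k1b)
        k3a, k3b = deriv(t + dt / 2, theta + dt / 2 * k2a, theta_dot + dt / 2 * k2b)
        k4a, k4b = deriv(t + dt, theta + dt * k3a, theta_dot + dt * k3b)
        theta += dt / 6 * (k1a + 2 * k2a + 2 * k3a + k4a)
        theta_dot += dt / 6 * (k1b + 2 * k2b + 2 * k3b + k4b)
        t += dt
    return amp_start, amplitude(t, theta, theta_dot)

if __name__ == "__main__":
    a0, a1 = simulate_shortening_pendulum()
    print("amplitude ratio      :", a1 / a0)
    print("adiabatic prediction :", 2 ** 0.75)
```

With the length halved over roughly two hundred periods, the amplitude ratio should lie close to $2^{3/4}$; shortening the string much faster would be expected to break the agreement, since the change is then no longer adiabatic.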

The above example shows that the action integral J = constant, that is, it is invariant to an adiabatic change. In retrospect this result is as expected in that the action integral should be minimized.


14.6  Canonical  perturbation  theory


Most  examples  in  classical  mechanics  discussed  so  far  have  been  capable  of  exact  solutions.  In  real  life,  the 
majority  of  problems  cannot  be  solved  exactly.  For  example,  in  celestial  mechanics  the  two-body  Kepler 
problem  can  be  solved  exactly,  but  solution  of  the  three-body  problem  is  intractable.  Typical  systems  in 
celestial  mechanics  are  never  as  simple  as  the  two-body  Kepler  system  because  of  the  influence  of  additional 
bodies.  Fortunately  in  most  cases  the  influence  of  additional  bodies  is  sufficiently  small  to  allow  use  of 
perturbation  theory.  That  is,  the  restricted  three-body  approximation  can  be  employed  for  which  the  system 
is  reduced  to  considering  it  as  an  exactly  solvable  two-body  problem,  subject  to  a small  perturbation  to  this 
solvable  two-body  system.  Note  that  even  though  the  change  in  the  Hamiltonian  due  to  the  perturbing  term 
may  be  small,  the  impact  on  the  motion  can  be  especially  large  near  a resonance. 

Consider that the Hamiltonian, subject to a time-dependent perturbation, is written as

$$H(q,p,t) = H_0(q,p,t) + \Delta H(q,p,t)$$

where $H_0(q,p,t)$ designates the unperturbed Hamiltonian and $\Delta H(q,p,t)$ designates the perturbing term. For the unperturbed system the Hamilton-Jacobi equation is given by

$$\mathcal{H}(Q_i,P_i,t) = H_0\left(q_1,\ldots,q_n; \frac{\partial S}{\partial q_1},\ldots,\frac{\partial S}{\partial q_n}; t\right) + \frac{\partial S}{\partial t} = 0 \tag{14.90}$$

where $S(q_i,P_i,t)$ is the generating function for the canonical transformation $(q,p) \rightarrow (Q,P)$. For the perturbed system, $S(q_i,P_i,t)$ still generates a canonical transformation, but the transformed Hamiltonian $\mathcal{H}(Q_i,P_i,t) \neq 0$. That is,

$$\mathcal{H}(Q_i,P_i,t) = H_0 + \Delta H(q,p,t) + \frac{\partial S}{\partial t} = \Delta H(q,p,t)$$

The equations of motion satisfied by the transformed variables now are

$$\dot{Q}_i = \frac{\partial \Delta H}{\partial P_i} \tag{14.148}$$

$$\dot{P}_i = -\frac{\partial \Delta H}{\partial Q_i} \tag{14.149}$$

These equations remain as difficult to solve as the full Hamiltonian. However, the perturbation technique assumes that $\Delta H$ is small, and that one can neglect the change of $(Q_i, P_i)$ over the perturbing interval. Therefore, to a first approximation, the unperturbed values of $\frac{\partial \Delta H}{\partial P_i}$ and $\frac{\partial \Delta H}{\partial Q_i}$ can be used in equations 14.148 and 14.149. A detailed explanation of canonical perturbation theory is presented in chapter 12 of Goldstein [Go50].


14.18  Example:  Harmonic  oscillator  perturbation 


(a) Consider first the Hamilton-Jacobi equation for the generating function $S(q,\alpha,t)$ for the case of a single free particle subject to the Hamiltonian $H = \frac{1}{2}p^2$. Find the canonical transformation $q = q(\beta,\alpha)$ and $p = p(\beta,\alpha)$ where $\beta$ and $\alpha$ are the transformed coordinate and momentum respectively.

The Hamilton-Jacobi equation is

$$\frac{\partial S}{\partial t} + H(q,p,t) = 0$$

Using $p = \frac{\partial S}{\partial q}$ in the Hamiltonian $H = \frac{1}{2}p^2$ gives

$$\frac{\partial S}{\partial t} + \frac{1}{2}\left(\frac{\partial S}{\partial q}\right)^2 = 0$$

Since $H$ does not depend on $q$, $t$ explicitly, then the two terms on the left hand side of the equation can be set equal to $-\gamma$, $\gamma$ respectively, where $\gamma$ is at most a function of $p$. Then the generating function is

$$S = \sqrt{2\gamma}\, q - \gamma t$$

Set $\alpha = \sqrt{2\gamma}$, then the generating function can be written as

$$S = \alpha q - \frac{1}{2}\alpha^2 t$$

The constant $\alpha$ can be identified with the new momentum $P$. Then the transformation equations become

$$p = \frac{\partial S}{\partial q} = \alpha$$

$$Q = \frac{\partial S}{\partial P} = \frac{\partial S}{\partial \alpha} = q - \alpha t = \beta$$

That is

$$q = \beta + \alpha t$$

which corresponds to motion with a uniform velocity $\alpha$ in the $q,p$ system.

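(b) Consider that the Hamiltonian is perturbed by addition of potential $U = \frac{q^2}{2}$ which corresponds to the harmonic oscillator. Then

$$H = \frac{1}{2}p^2 + \frac{q^2}{2}$$

Consider the transformed Hamiltonian

$$\mathcal{H} = H + \frac{\partial S}{\partial t} = \frac{1}{2}p^2 + \frac{q^2}{2} - \frac{\alpha^2}{2} = \frac{q^2}{2} = \frac{1}{2}(\beta + \alpha t)^2$$

Hamilton's equations of motion

$$\dot{Q} = \frac{\partial \mathcal{H}}{\partial P} \qquad \dot{P} = -\frac{\partial \mathcal{H}}{\partial Q}$$

give that

$$\dot\beta = (\beta + \alpha t)\, t$$

$$\dot\alpha = -(\beta + \alpha t)$$

These two equations can be solved to give

$$\ddot\alpha + \alpha = 0$$

which is the equation of a harmonic oscillator showing that $\alpha$ is harmonic of the form $\alpha = \alpha_0 \sin(t + \delta)$ where $\alpha_0, \delta$ are constants of motion. Thus

$$\beta = -\dot\alpha - \alpha t = -\alpha_0[\cos(t + \delta) + t\sin(t + \delta)]$$

The transformation equations then give

$$p = \alpha = \alpha_0 \sin(t + \delta)$$

$$q = \beta + \alpha t = -\dot\alpha = -\alpha_0 \cos(t + \delta)$$

Hence the solution for the perturbed system is harmonic, which is to be expected since the potential has a quadratic dependence on position.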
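The result of part (b) can be verified numerically. The following is a minimal sketch, assuming the transformed Hamiltonian $\frac{1}{2}(\beta + \alpha t)^2$ and initial values $\alpha(0) = \alpha_0\sin\delta$, $\beta(0) = -\alpha_0\cos\delta$ taken from the closed-form solution; the function names, step size, and parameter values are illustrative. It integrates $\dot\beta = (\beta + \alpha t)t$, $\dot\alpha = -(\beta + \alpha t)$ with a fourth-order Runge-Kutta scheme and compares $q = \beta + \alpha t$ with $-\alpha_0\cos(t + \delta)$.

```python
import math

def rhs(t, beta, alpha):
    """Hamilton's equations for the transformed Hamiltonian (beta + alpha*t)^2 / 2."""
    q = beta + alpha * t
    return q * t, -q                      # (beta_dot, alpha_dot)

def max_deviation_from_harmonic(alpha0=1.0, delta=0.3, t_end=5.0, dt=1e-3):
    """Largest |q_numeric - q_exact| found while integrating from 0 to t_end."""
    alpha = alpha0 * math.sin(delta)
    beta = -alpha0 * math.cos(delta)
    t, worst = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        # classical fourth-order Runge-Kutta step
        k1b, k1a = rhs(t, beta, alpha)
        k2b, k2a = rhs(t + dt / 2, beta + dt / 2 * k1b, alpha + dt / 2 * k1a)
        k3b, k3a = rhs(t + dt / 2, beta + dt / 2 * k2b, alpha + dt / 2 * k2a)
        k4b, k4a = rhs(t + dt, beta + dt * k3b, alpha + dt * k3a)
        beta += dt / 6 * (k1b + 2 * k2b + 2 * k3b + k4b)
        alpha += dt / 6 * (k1a + 2 * k2a + 2 * k3a + k4a)
        t += dt
        q_numeric = beta + alpha * t
        q_exact = -alpha0 * math.cos(t + delta)
        worst = max(worst, abs(q_numeric - q_exact))
    return worst

if __name__ == "__main__":
    print("max deviation from harmonic motion:", max_deviation_from_harmonic())
```

Although $\beta(t)$ itself grows with time, the combination $q = \beta + \alpha t$ stays bounded and harmonic, which is what the small deviation reported by the sketch is meant to illustrate.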


14.19  Example:  Lindblad  resonance  in  planetary  and  galactic  motion 

Use of canonical perturbation theory in celestial mechanics has been exploited by Professor Alice Quillen and her group. They combine use of action-angle variables and Hamilton-Jacobi theory to investigate the role of Lindblad resonance in planetary motion, and also in stellar motion in galaxies. A Lindblad resonance is an orbital resonance in which the orbital period of a celestial body is a simple multiple of some forcing frequency. Even for very weak perturbing forces, such resonance behavior can lead to orbit capture and chaotic motion.

For planetary motion the planet masses are about 1/1000 that of the central star, so the perturbations to Kepler orbits are small. However, Lindblad resonance for planetary motion led to Saturn's rings, which result from perturbations produced by the moons of Saturn that sculpt and clear dust rings. Stellar orbits in disk galaxies are perturbed a few percent by non axially-symmetric galactic features such as spiral arms or bars. Lindblad resonances perturb stellar motion and drive spiral density waves at distances from the center of a galactic disk where the natural frequency of the radial component of a star's orbital velocity is close to the frequency of the fluctuations in the gravitational field due to passage through spiral arms or bars. If a star's orbital speed around a galactic center is greater than that of the part of a spiral arm through which it is traversing, then an inner Lindblad resonance occurs which speeds up the star's orbital speed, moving the orbit outwards. If the orbital speed is less than that of a spiral arm, an outer Lindblad resonance occurs, causing inward movement of the orbit.




14.7  Symplectic  representation 


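Hamilton's first-order equations of motion are symmetric if the generalized and constraint force terms, in equation 14.9, are excluded.

$$\dot{q}_j = \frac{\partial H}{\partial p_j} \qquad -\dot{p}_j = \frac{\partial H}{\partial q_j} \tag{14.4}$$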


This stimulated attempts to treat the canonical variables $(q, p)$ in a symmetric form using group theory. Some graduate textbooks in classical mechanics have adopted use of symplectic symmetry in order to unify the presentation of Hamiltonian mechanics. For a system of $n$ degrees of freedom, a column matrix $\boldsymbol\eta$ is constructed that has $2n$ elements where

$$\eta_j = q_j \qquad \eta_{n+j} = p_j \qquad j \le n \tag{14.150}$$

Therefore the column matrix $\frac{\partial H}{\partial \boldsymbol\eta}$ has elements

$$\left(\frac{\partial H}{\partial \boldsymbol\eta}\right)_j = \frac{\partial H}{\partial q_j} \qquad \left(\frac{\partial H}{\partial \boldsymbol\eta}\right)_{n+j} = \frac{\partial H}{\partial p_j} \qquad j \le n \tag{14.151}$$

The symplectic matrix $\mathbf{J}$ is defined as being a $2n$ by $2n$ skew-symmetric, orthogonal matrix that is broken into four $n \times n$ null or unit matrices according to the scheme

$$\mathbf{J} = \begin{pmatrix} [0] & +[1] \\ -[1] & [0] \end{pmatrix} \tag{14.152}$$

where $[0]$ is the $n$-dimensional null matrix, for which all elements are zero. Also $[1]$ is the $n$-dimensional unit matrix, for which the diagonal matrix elements are unity and all off-diagonal matrix elements are zero. The $\mathbf{J}$ matrix accounts for the opposite signs used in the equations for $\dot{q}$ and $\dot{p}$. The symplectic representation allows Hamilton's equations of motion to be written in the compact form

$$\dot{\boldsymbol\eta} = \mathbf{J}\frac{\partial H}{\partial \boldsymbol\eta} \tag{14.153}$$

This textbook does not use the elegant symplectic representation since it excludes the important generalized forces and Lagrange multiplier forces.
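The compact form $\dot{\boldsymbol\eta} = \mathbf{J}\, \partial H/\partial\boldsymbol\eta$ can be made concrete with a short sketch. It assumes, purely for illustration, a set of $n$ uncoupled harmonic oscillators with $H = \sum_j (p_j^2/2m + k q_j^2/2)$; the helper names are hypothetical. The code builds $\mathbf{J}$ from its four $n \times n$ blocks and applies it to the gradient of $H$, which should reproduce $\dot{q}_j = p_j/m$ and $\dot{p}_j = -k q_j$.

```python
def symplectic_matrix(n):
    """Return the 2n x 2n matrix J = [[0, +1], [-1, 0]] built from n x n blocks."""
    size = 2 * n
    J = [[0.0] * size for _ in range(size)]
    for j in range(n):
        J[j][n + j] = 1.0        # upper-right block: +[1]
        J[n + j][j] = -1.0       # lower-left block:  -[1]
    return J

def matvec(M, v):
    """Multiply matrix M by column vector v."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in M]

def grad_H_oscillator(eta, mass=1.0, k=1.0):
    """dH/d(eta) for H = sum_j p_j^2/(2m) + k q_j^2/2, eta = (q_1..q_n, p_1..p_n)."""
    n = len(eta) // 2
    q, p = eta[:n], eta[n:]
    return [k * qj for qj in q] + [pj / mass for pj in p]

def eta_dot(eta, mass=1.0, k=1.0):
    """Hamilton's equations in symplectic form: eta_dot = J dH/d(eta)."""
    J = symplectic_matrix(len(eta) // 2)
    return matvec(J, grad_H_oscillator(eta, mass, k))

if __name__ == "__main__":
    # eta = (q1, q2, p1, p2)
    print(eta_dot([0.3, -1.2, 2.0, 0.5], mass=2.0, k=4.0))
```

The first $n$ entries of the result are the velocities $\dot{q}_j$ and the last $n$ are $\dot{p}_j$, so the opposite signs in Hamilton's equations are carried entirely by $\mathbf{J}$.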


14.8  Comparison  of  the  Lagrangian  and  Hamiltonian  formulations 

Common  features 

The discussion of Lagrangian and Hamiltonian dynamics has illustrated the power of such algebraic formulations. Both approaches are based on application of variational principles to scalar energy which gives the freedom to concentrate solely on active forces and to ignore internal forces. Both methods can handle many-body systems and exploit canonical transformations, which are impractical or impossible using the vectorial Newtonian mechanics. These algebraic approaches simplify the calculation of the motion for constrained systems by representing the vector force fields, as well as the corresponding equations of motion, in terms of either the Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ or the action functional $S(\mathbf{q}, \mathbf{p}, t)$ which are related by the definite integral

$$S(\mathbf{q},\mathbf{p},t) = \int_{t_1}^{t_2} L(\mathbf{q},\dot{\mathbf{q}},t)\, dt \tag{14.1}$$

The Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$, and the action functional $S(\mathbf{q}, \mathbf{p}, t)$, are scalar functions under rotation, but they determine the vector force fields and the corresponding equations of motion. Thus the use of rotationally-invariant functions $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $S(\mathbf{q}, \mathbf{p}, t)$ provides a simple representation of the vector force fields. This is analogous to the use of scalar potential fields $\phi(\mathbf{q}, t)$ to represent the electrostatic and gravitational vector force fields. Like scalar potential fields, Lagrangian and Hamiltonian mechanics represent the observables as derivatives of $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $S(\mathbf{q}, \mathbf{p}, t)$, and the absolute values of $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $S(\mathbf{q}, \mathbf{p}, t)$ are undefined; only differences in $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $S(\mathbf{q}, \mathbf{p}, t)$ are observable. For example, the generalized momenta are given by the derivatives $p_j = \frac{\partial L}{\partial \dot{q}_j}$ and $p_j = \frac{\partial S}{\partial q_j}$. The physical significance of the least action $S(\mathbf{q}, \boldsymbol\alpha, t)$ is illustrated when the canonically transformed momentum $\mathbf{P} = \boldsymbol\alpha$ is a constant. Then the generalized momenta and the Hamilton-Jacobi equation imply that the total time derivative of the action equals

$$\frac{dS}{dt} = \frac{\partial S}{\partial q_j}\dot{q}_j + \frac{\partial S}{\partial t} = \mathbf{p}\cdot\dot{\mathbf{q}} - H = L \tag{14.154}$$

The indefinite integral of this equation reproduces the definite integral (14.1) to within an arbitrary constant, i.e.

$$S(\mathbf{q},\mathbf{p}) = \int L(\mathbf{q},\dot{\mathbf{q}},t)\, dt + \text{constant} \tag{14.155}$$


Lagrangian  formulation: 

Consider a system with $n$ independent generalized coordinates, plus $m$ constraint forces that are not required to be known. The Lagrangian approach can reduce the system to a minimal system of $s = n - m$ independent generalized coordinates leading to $s = n - m$ second-order differential equations. By comparison, the Newtonian approach uses $n + m$ unknowns. Alternatively, the Lagrange multipliers approach allows determination of the holonomic constraint forces resulting in $s = n + m$ second-order equations to determine $s = n + m$ unknowns. The Lagrangian potential function is limited to conservative forces, but generalized forces can be used to handle non-conservative and non-holonomic forces. The advantage of the Lagrange equations of motion is that they can deal with any type of force, conservative or non-conservative, and they directly determine $q, \dot{q}$ rather than $q, p$ which then requires relating $p$ to $\dot{q}$. The Lagrange approach is superior to the Hamiltonian approach if a numerical solution is required for typical undergraduate problems in classical mechanics. However, Hamiltonian mechanics has a clear advantage for addressing more profound and philosophical questions in physics.


Hamiltonian  formulation: 

For  a system  with  n independent  generalized  coordinates,  and  m constraint  forces,  the  Hamiltonian  approach 
determines  2 n first-order  differential  equations.  In  contrast  to  Lagrangian  mechanics,  where  the  Lagrangian 
is  a function  of  the  coordinates  and  their  velocities,  the  Hamiltonian  uses  the  variables  q and  p,  rather 
than  velocity.  The  Hamiltonian  has  twice  as  many  independent  variables  as  the  Lagrangian  which  is  a great 
advantage,  not  a disadvantage,  since  it  broadens  the  realm  of  possible  transformations  that  can  be  used  to 
simplify  the  solutions.  Hamiltonian  mechanics  uses  the  conjugate  coordinates  q,  p,  corresponding  to  phase 
space.  This  is  an  advantage  in  most  branches  of  physics  and  engineering.  Compared  to  Lagrangian  mechanics, 
Hamiltonian  mechanics  has  a significantly  broader  arsenal  of  powerful  techniques  that  can  be  exploited  to 
obtain  an  analytical  solution  of  the  integrals  of  the  motion  for  complicated  systems.  These  techniques 
include the Poisson bracket formulation, canonical transformations, the Hamilton-Jacobi approach, the
action-angle  variables,  and  canonical  perturbation  theory.  In  addition,  Hamiltonian  dynamics  also  provides 
a means  of  determining  the  unknown  variables  for  which  the  solution  assumes  a soluble  form,  and  it  is 
ideal  for  study  of  the  fundamental  underlying  physics  in  applications  to  other  fields  such  as  quantum  or 
statistical  physics.  However,  the  Hamiltonian  approach  endemically  assumes  that  the  system  is  conservative 
putting  it  at  a disadvantage  with  respect  to  the  Lagrangian  approach.  The  appealing  symmetry  of  the 
Hamiltonian  equations,  plus  their  ability  to  utilize  canonical  transformations,  makes  it  the  formalism  of 
choice  for  examination  of  system  dynamics.  For  example,  Hamilton-Jacobi  theory,  action-angle  variables 
and  canonical  perturbation  theory  are  used  extensively  to  solve  complicated  multibody  orbit  perturbations 
in  celestial  mechanics  by  finding  a canonical  transformation  that  transforms  the  perturbed  Hamiltonian  to 
a solved  unperturbed  Hamiltonian. 

The Hamiltonian formalism features prominently in quantum mechanics since there are well established rules for transforming the classical coordinates and momenta into linear operators used in quantum mechanics. The variables $q, \dot{q}$ used in Lagrangian mechanics do not have simple analogs in quantum physics. As a consequence, the Poisson bracket formulation and action-angle variables of Hamiltonian mechanics played a key role in development of matrix mechanics by Heisenberg, Born, and Dirac, while the Hamilton-Jacobi formulation played a key role in development of Schrödinger's wave mechanics. Similarly, Hamiltonian mechanics is the preeminent variational approach used in statistical mechanics.


14.9  Summary

This  chapter  has  gone  beyond  what  is  normally  covered  in  an  undergraduate  course  in  classical  mechanics, 
in  order  to  illustrate  the  power  of  the  remarkable  arsenal  of  methods  available  for  solution  of  the  equations  of 
motion  using  Hamiltonian  mechanics.  This  has  included  the  Poisson  bracket  representation  of  Hamiltonian 
formulation  of  mechanics,  canonical  transformations,  Hamilton- Jacobi  theory,  action-angle  variables,  and 
canonical perturbation theory. The purpose was to illustrate the power of variational principles in Hamiltonian mechanics and how they relate to fields such as quantum mechanics. The following are the key points
made  in  this  chapter. 


Poisson brackets: The elegant and powerful Poisson bracket formalism of Hamiltonian mechanics was introduced. The Poisson bracket of any two continuous functions of generalized coordinates $F(p,q)$ and $G(p,q)$ is defined to be

$$[F,G]_{qp} \equiv \sum_i \left(\frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i}\right) \tag{14.13}$$

The fundamental Poisson brackets equal

$$[q_k, q_l] = 0 \tag{14.21}$$

$$[p_k, p_l] = 0 \tag{14.22}$$

$$[q_k, p_l] = -[p_l, q_k] = \delta_{kl} \tag{14.23}$$

The Poisson bracket is invariant to a canonical transformation from $(q,p)$ to $(Q,P)$. That is

$$[F,G]_{QP} = \sum_k \left(\frac{\partial F}{\partial Q_k}\frac{\partial G}{\partial P_k} - \frac{\partial F}{\partial P_k}\frac{\partial G}{\partial Q_k}\right) = [F,G]_{qp} \tag{14.32}$$

There is a one-to-one correspondence between the commutator and Poisson bracket of two independent functions,

$$(F_1 G_1 - G_1 F_1) = \lambda [F_1, G_1] \tag{14.38}$$

where $\lambda$ is an independent constant. In particular, $F_1$ and $G_1$ commute if the Poisson bracket $[F_1, G_1] = 0$.
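The definition (14.13) and the fundamental brackets (14.21) to (14.23) can be checked numerically. The sketch below is an illustration only: it assumes functions written as `f(q, p)` of coordinate and momentum lists, evaluates the partial derivatives by central differences, and uses a two-dimensional isotropic oscillator with unit mass and unit spring constant as a hypothetical test Hamiltonian. None of these names come from the text.

```python
def poisson_bracket(F, G, q, p, h=1e-5):
    """[F, G] = sum_i (dF/dq_i dG/dp_i - dF/dp_i dG/dq_i), by central differences."""
    def d_dq(func, i):
        q_plus, q_minus = list(q), list(q)
        q_plus[i] += h
        q_minus[i] -= h
        return (func(q_plus, p) - func(q_minus, p)) / (2 * h)

    def d_dp(func, i):
        p_plus, p_minus = list(p), list(p)
        p_plus[i] += h
        p_minus[i] -= h
        return (func(q, p_plus) - func(q, p_minus)) / (2 * h)

    return sum(d_dq(F, i) * d_dp(G, i) - d_dp(F, i) * d_dq(G, i)
               for i in range(len(q)))

# Test functions for a system with two degrees of freedom
def q0(q, p): return q[0]
def q1(q, p): return q[1]
def p0(q, p): return p[0]
def p1(q, p): return p[1]

def hamiltonian(q, p):
    """Isotropic two-dimensional oscillator with m = k = 1 (illustrative choice)."""
    return 0.5 * (p[0] ** 2 + p[1] ** 2) + 0.5 * (q[0] ** 2 + q[1] ** 2)

def angular_momentum(q, p):
    return q[0] * p[1] - q[1] * p[0]

if __name__ == "__main__":
    point_q, point_p = [0.7, -0.4], [1.1, 0.3]
    print("[q0, p0] =", poisson_bracket(q0, p0, point_q, point_p))
    print("[q0, p1] =", poisson_bracket(q0, p1, point_q, point_p))
    print("[L, H]   =", poisson_bracket(angular_momentum, hamiltonian, point_q, point_p))
```

A vanishing bracket of the angular momentum with this Hamiltonian is the numerical counterpart of the statement that a quantity with no explicit time dependence whose Poisson bracket with $H$ is zero is a constant of motion.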


Poisson bracket representation of Hamiltonian mechanics: It has been shown that the Poisson bracket formalism contains the Hamiltonian equations of motion and is invariant to canonical transformations. Also this formalism extends Hamilton's canonical equations to non-commuting canonical variables. Hamilton's equations of motion can be expressed directly in terms of the Poisson brackets

$$\dot{q}_k = [q_k, H] = \frac{\partial H}{\partial p_k} \tag{14.57}$$

$$\dot{p}_k = [p_k, H] = -\frac{\partial H}{\partial q_k} \tag{14.58}$$

An important result is that the total time derivative of any operator is given by

$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + [G, H] \tag{14.45}$$


Poisson  brackets  provide  a powerful  means  of  determining  which  observables  are  time  independent  and 
whether  different  observables  can  be  measured  simultaneously  with  unlimited  precision.  It  was  shown  that 
the  Poisson  bracket  is  invariant  to  canonical  transformations,  which  is  a valuable  feature  for  Hamiltonian 
mechanics.  Poisson  brackets  were  used  to  prove  Liouville’s  theorem  which  plays  an  important  role  in  the  use 
of  Hamiltonian  phase  space  in  statistical  mechanics.  The  Poisson  bracket  is  equally  applicable  to  continuous 
solutions in classical mechanics as well as discrete solutions in quantized systems.


Canonical transformations: A transformation between a canonical set of variables $(q,p)$ with Hamiltonian $H(q,p,t)$ to another set of canonical variables $(Q,P)$ with Hamiltonian $\mathcal{H}(Q,P,t)$ can be achieved using a generating function $F$ such that

$$\mathcal{H}(Q,P,t) = H(q,p,t) + \frac{\partial F}{\partial t} \tag{14.89}$$

Possible generating functions are summarized in the following table.

| Generating function | Generating function derivatives | Trivial special case |
|---|---|---|
| $F = F_1(q,Q,t)$ | $p_i = \frac{\partial F_1}{\partial q_i}$, $P_i = -\frac{\partial F_1}{\partial Q_i}$ | $F_1 = q_i Q_i$, $Q_i = p_i$, $P_i = -q_i$ |
| $F = F_2(q,P,t) - Q \cdot P$ | $p_i = \frac{\partial F_2}{\partial q_i}$, $Q_i = \frac{\partial F_2}{\partial P_i}$ | $F_2 = q_i P_i$, $Q_i = q_i$, $P_i = p_i$ |
| $F = F_3(p,Q,t) + q \cdot p$ | $q_i = -\frac{\partial F_3}{\partial p_i}$, $P_i = -\frac{\partial F_3}{\partial Q_i}$ | $F_3 = p_i Q_i$, $Q_i = -q_i$, $P_i = -p_i$ |
| $F = F_4(p,P,t) + q \cdot p - Q \cdot P$ | $q_i = -\frac{\partial F_4}{\partial p_i}$, $Q_i = \frac{\partial F_4}{\partial P_i}$ | $F_4 = p_i P_i$, $Q_i = p_i$, $P_i = -q_i$ |

If the canonical transformation makes $\mathcal{H}(Q,P,t) = 0$ then the conjugate variables $(Q,P)$ are constants of motion. Similarly, if a coordinate $Q_i$ is cyclic in $\mathcal{H}(Q,P,t)$, then the corresponding $P_i$ is a constant of motion.


Hamilton-Jacobi theory: Hamilton-Jacobi theory determines the generating function required to perform canonical transformations, which leads to a powerful method for obtaining the equations of motion for a system. The Hamilton-Jacobi theory uses the action function $S \equiv F_2$ as a generating function, and the canonical momentum is given by

$$p_i = \frac{\partial S}{\partial q_i} \tag{14.4}$$

This can be used to replace $p_i$ in the Hamiltonian $H$ leading to the Hamilton-Jacobi equation

$$H\left(q; \frac{\partial S}{\partial q}; t\right) + \frac{\partial S}{\partial t} = 0 \tag{14.94}$$

Solutions of the Hamilton-Jacobi equation were obtained by separation of variables. The close optical-mechanical analogy of the Hamilton-Jacobi theory is an important advantage of this formalism that led to it playing a pivotal role in the development of wave mechanics by Schrödinger.


Action-angle variables: The action-angle variables exploit a canonical transformation from $(q,p) \rightarrow (\phi, I)$ where

$$I_i \equiv \frac{1}{2\pi} J_i = \frac{1}{2\pi} \oint p_i\, dq_i \tag{14.117}$$

For periodic motion the phase-space trajectory is closed with area given by $J$ and this area is conserved for the above canonical transformation. For a conserved Hamiltonian the action variable $I$ is independent of the angle variable $\phi$. The time dependence of the angle variable $\phi$ directly determines the frequency of the periodic motion without recourse to calculation of the detailed trajectory of the periodic motion.


Canonical perturbation theory: Canonical perturbation theory is a valuable method of handling multibody interactions. The adiabatic invariance of the action-angle variables provides a powerful approach for exploiting canonical perturbation theory.


Comparison  of  Lagrangian  and  Hamiltonian  formulations:  The  remarkable  power,  and  intellectual 
beauty,  provided  by  use  of  variational  principles  to  exploit  the  underlying  principles  of  natural  economy  in 
nature,  has  had  a long  and  rich  history.  It  has  led  to  profound  developments  in  many  branches  of  theoretical 
physics.  However,  it  is  noted  that  although  the  above  algebraic  formulations  of  classical  mechanics  have  been 
used  for  over  two  centuries,  the  important  limitations  of  these  algebraic  formulations  to  non-linear  systems 
remain  a challenge  that  still  is  being  addressed. 

It has been shown that the Lagrangian and Hamiltonian formulations represent the vector force fields, and the corresponding equations of motion, in terms of the Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$, or the action functional $S(\mathbf{q}, \mathbf{p}, t)$, which are scalars under rotation. The Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ is related to the action functional $S(\mathbf{q}, \mathbf{p}, t)$ by

$$S(\mathbf{q},\mathbf{p},t) = \int_{t_1}^{t_2} L(\mathbf{q},\dot{\mathbf{q}},t)\, dt \tag{14.1}$$

These functions are analogous to electric potential, in that the observables are derived by taking derivatives of the Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ or the action functional $S(\mathbf{q}, \mathbf{p}, t)$. The Lagrangian formulation is more convenient for deriving the equations of motion for simple mechanical systems. The Hamiltonian formulation has a greater arsenal of techniques for solving complicated problems, plus it uses the canonical variables $(q_i, p_i)$ which are the variables of choice for applications to quantum mechanics and statistical mechanics.


Workshop  exercises


1. Poisson brackets are a powerful means of elucidating when observables are constants of motion and whether two observables can be simultaneously measured with unlimited precision. Consider a spherically symmetric Hamiltonian

$$H = \frac{1}{2m}\left(p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2\sin^2\theta}\right) + U(r)$$

for a mass $m$ where $U(r)$ is a central potential. Use the Poisson bracket plus the time dependence to determine the following:

(a) Does $p_\phi$ commute with $H$ and is it a constant of motion?

(b) Does $p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}$ commute with $H$ and is it a constant of motion?

(c) Does $p_r$ commute with $H$ and is it a constant of motion?

(d) Does $p_\phi$ commute with $p_\theta$ and what does the result imply?

2. Consider the Poisson brackets for angular momentum $\mathbf{L}$

(a) Show $\{L_i, r_j\} = \epsilon_{ijk} r_k$, where the Levi-Civita tensor is

$$\epsilon_{ijk} = \begin{cases} +1 & \text{if } ijk \text{ are cyclically permuted} \\ -1 & \text{if } ijk \text{ are anti-cyclically permuted} \\ 0 & \text{if } i = j \text{ or } i = k \text{ or } j = k \end{cases}$$

(b) Show $\{L_i, p_j\} = \epsilon_{ijk} p_k$.

(c) Show $\{L_i, L_j\} = \epsilon_{ijk} L_k$. The following identity may be useful: $\epsilon_{ijk}\epsilon_{ilm} = \delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl}$.

(d) Show $\{L_i, L^2\} = 0$.

3. Consider the Hamiltonian of a two-dimensional harmonic oscillator,

$$H = \frac{\mathbf{p}^2}{2m} + \frac{1}{2}m\left(\omega_1^2 r_1^2 + \omega_2^2 r_2^2\right)$$

What condition is satisfied if $L^2$ is a conserved quantity?

Problems

1. Consider the motion of a particle of mass $m$ in an isotropic harmonic oscillator potential $U = \frac{1}{2}kr^2$ and take the orbital plane to be the $x$-$y$ plane. The Hamiltonian is then

$$H \equiv S_0 = \frac{1}{2m}\left(p_x^2 + p_y^2\right) + \frac{1}{2}k\left(x^2 + y^2\right)$$

Introduce the three quantities

$$S_1 = \frac{1}{2m}\left(p_x^2 - p_y^2\right) + \frac{1}{2}k\left(x^2 - y^2\right)$$

$$S_2 = \frac{1}{m}p_x p_y + kxy$$

$$S_3 = \omega(x p_y - y p_x)$$

with $\omega = \sqrt{k/m}$. Use Poisson brackets to solve the following:

(a) Show that $[S_0, S_i] = 0$ for $i = 1, 2, 3$ proving that $(S_1, S_2, S_3)$ are constants of motion.

(b) Show that

$$[S_1, S_2] = 2\omega S_3$$
$$[S_2, S_3] = 2\omega S_1$$
$$[S_3, S_1] = 2\omega S_2$$

so that $(2\omega)^{-1}(S_1, S_2, S_3)$ have the same Poisson bracket relations as the components of a 3-dimensional angular momentum.

(c) Show that

$$S_0^2 = S_1^2 + S_2^2 + S_3^2$$

2. Assume that the transformation equations between the two sets of coordinates $(q,p)$ and $(Q,P)$ are

$$Q = \ln\left(\frac{\sin p}{q}\right)$$

$$P = q\cot p$$

(a) Assuming that $q, p$ are canonical variables, i.e. $[q,p] = 1$, show directly from the above transformation equations that $Q, P$ are canonical variables.

(b) Show that

$$p\, dq - P\, dQ = d(pq + q\cot p)$$

(c) Find the explicit generating function $F_1(q,Q)$ that generates this transformation between these two sets of canonical variables. Note the integral $\int \sin^{-1}x\, dx = \sqrt{1 - x^2} + x\sin^{-1}x$.

3. Consider the uniform motion of a free particle of mass $m$. The Hamiltonian is a constant of motion and so is the function

$$F(x,p,t) = x - \frac{pt}{m}$$

(a) Compare the Poisson bracket $[H, F]$ with $\frac{\partial F}{\partial t}$ and prove that $F$ is a constant of motion.

(b) Prove that the Poisson bracket of two constants of motion is itself a constant of motion even if the constants $F(x,p,t)$ and $G(x,p,t)$ depend explicitly on time.

(c) Show in general that if the Hamiltonian and the quantity $F$ are constants of motion, then $\frac{\partial F}{\partial t}$ also is a constant of motion.

4. (a) Solve the Hamilton-Jacobi equation for the generating function $S(q,\alpha,t)$ for a single particle moving under the Hamiltonian $H = \frac{1}{2}p^2$. Find the canonical transformation $q = q(\beta,\alpha)$ and $p = p(\beta,\alpha)$ where $\beta$ and $\alpha$ are the transformed coordinate and momentum respectively. Interpret your result.

(b) If there is a perturbing Hamiltonian $\Delta H = \frac{1}{2}q^2$, then $\alpha$ no longer will be constant. Express the transformed Hamiltonian $\mathcal{H}$ (using the same transformation found in part (a)) in terms of $\alpha$, $\beta$, and $t$. Solve for $\beta(t)$ and $\alpha(t)$ and show that the perturbed solution $q[\beta(t), \alpha(t)]$, $p[\beta(t), \alpha(t)]$ is simple harmonic.


Chapter  15 


Analytical  formulations  for  continuous 
systems 

15.1  Introduction 

Lagrangian  and  Hamiltonian  mechanics  have  been  used  to  determine  the  equations  of  motion  for  discrete 
systems having a finite, albeit sometimes large, number of discrete variables $q_i$ where $1 \le i \le n$. There are important classes of systems where it is more convenient to treat the system as being continuous. For example, the interatomic spacing in solids is a few $10^{-10}$ m which is negligible compared with the size of typical macroscopic, three-dimensional solid objects. As a consequence, for wavelengths much greater than the atomic spacing in solids, it is useful to treat macroscopic crystalline lattice systems as continuous three-dimensional uniform solids, rather than as three-dimensional discrete lattice chains. Fluid and gas dynamics
are  other  examples  of  continuous  mechanical  systems.  Another  important  class  of  continuous  systems  involves 
the  theory  of  fields,  such  as  electromagnetic  fields.  Lagrangian  and  Hamiltonian  mechanics  of  the  continua 
extend  classical  mechanics  into  the  advanced  topic  of  field  theory.  This  chapter  goes  beyond  the  scope  of  a 
typical  undergraduate  classical  mechanics  course  in  order  to  provide  a brief  glimpse  of  how  Lagrangian  and 
Hamiltonian  mechanics  underlie  advanced  and  important  aspects  of  the  mechanics  of  the  continua,  including 
field  theory. 


15.2  The  continuous  uniform  linear  chain 


The Lagrangian for the discrete lattice chain, for longitudinal modes, is given by equation 12.76 to be

\[ L = \frac{1}{2}\sum_{j=1}^{n+1}\left(m\dot{q}_j^2 - \kappa\left(q_{j-1} - q_j\right)^2\right) \tag{15.1} \]

where the $n$ masses are attached in series to $n+1$ identical springs of length $d$ and spring constant $\kappa$. Assume
that the spring has a uniform cross-section area $A$ and length $d$. Then each spring volume element $\Delta\tau = Ad$
has a mass $m$, that is, the volume mass density $\rho = \frac{m}{\Delta\tau}$, or $m = \rho\Delta\tau$. Chapter 15.5.3 will show that the
spring constant $\kappa = \frac{EA}{d}$ where $E$ is Young's modulus, $A$ is the cross sectional area of the chain element, and
$d$ is the length of the element. Then the spring constant can be written as $\kappa = \frac{E\Delta\tau}{d^2}$. Therefore equation
15.1 can be expressed as a sum over volume elements $\Delta\tau = Ad$

\[ L = \frac{1}{2}\sum_{j=1}^{n+1}\left(\rho\dot{q}_j^2 - E\left(\frac{q_{j-1} - q_j}{d}\right)^2\right)\Delta\tau \tag{15.2} \]


In the limit that $n \to \infty$ and the spacing $d = dx \to 0$, the summation in equation 15.2 can be written
as a volume integral where $x = jd$ is the distance along the linear chain and the volume element $\Delta\tau \to 0$.
Then the Lagrangian can be written as the integral over the volume element $d\tau$ rather than a summation
over $\Delta\tau$. That is,

\[ L = \frac{1}{2}\int\left(\rho\dot{q}^2 - E\left(\frac{\partial q(x,t)}{\partial x}\right)^2\right)d\tau \tag{15.3} \]


The coordinate $q_j(t)$ for the discrete chain has become a continuous function $q(x,t)$ for the uniform chain.
Thus the integral form of the Lagrangian can be expressed as

\[ L = \int \mathfrak{L}\,d\tau \tag{15.4} \]

where the function $\mathfrak{L}$ is called the Lagrangian density defined by

\[ \mathfrak{L} \equiv \frac{1}{2}\left(\rho\dot{q}^2 - E\left(\frac{\partial q(x,t)}{\partial x}\right)^2\right) \tag{15.5} \]


The variable $x$ in the Lagrangian density is not a generalized coordinate; it only serves the role of a continuous
index played previously by the index $j$. For the discrete case, each value of $j$ defined a different generalized
coordinate $q_j$. Now for each value of $x$ there is a continuous function $q(x,t)$ which is a function of both
position and time.

Lagrange's equations of motion applied to the continuous Lagrangian in equation 15.4 give

\[ \rho\frac{\partial^2 q}{\partial t^2} - E\frac{\partial^2 q}{\partial x^2} = 0 \tag{15.6} \]

This is the familiar wave equation in one dimension for a longitudinal wave on the continuous chain with a
phase velocity

\[ v_{phase} = \sqrt{\frac{E}{\rho}} \tag{15.7} \]

The continuous linear chain also can exhibit transverse modes which have a Lagrangian density where the
Young's modulus $E$ is replaced by the tension $\tau$ in the chain, and $\rho$ is replaced by the linear mass density $\mu$
of the chain, leading to a phase velocity for a transverse wave $v_{phase} = \sqrt{\frac{\tau}{\mu}}$.
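The continuum limit can be checked numerically. The sketch below assumes the standard dispersion relation for a uniform discrete chain, $\omega = 2\sqrt{\kappa/m}\,|\sin(kd/2)|$, which is not derived in this section, together with $\kappa = EA/d$ and $m = \rho A d$ from the text. The function names and the material values are illustrative assumptions. It shows the discrete phase velocity approaching the continuum value of equation 15.7 as the spacing $d$ becomes small compared with the wavelength.

```python
import math


def discrete_phase_velocity(k, d, E, rho, A=1.0):
    """Phase velocity omega/k for wavenumber k on a discrete chain of spacing d.

    Assumes the standard uniform-chain dispersion relation
    omega = 2*sqrt(kappa/m)*|sin(k*d/2)| with kappa = E*A/d and m = rho*A*d.
    """
    kappa = E * A / d          # spring constant of one chain element
    m = rho * A * d            # mass of one volume element
    omega = 2.0 * math.sqrt(kappa / m) * abs(math.sin(0.5 * k * d))
    return omega / k


def continuum_phase_velocity(E, rho):
    """Phase velocity of the continuous chain, equation 15.7."""
    return math.sqrt(E / rho)


if __name__ == "__main__":
    E, rho = 2.0e11, 7.8e3      # illustrative, roughly steel-like values (assumed)
    k = 2.0 * math.pi           # wavenumber for a wavelength of 1 m
    for d in (0.5, 1.0e-2, 1.0e-4):
        ratio = discrete_phase_velocity(k, d, E, rho) / continuum_phase_velocity(E, rho)
        print(f"d = {d:g} m: v_discrete / v_continuum = {ratio:.8f}")
```

The cross-section $A$ cancels in the ratio $\kappa/m$, which is why the continuum phase velocity depends only on $E$ and $\rho$.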


15.3  The  Lagrangian  density  formulation  for  continuous  systems 

15.3.1  One  spatial  dimension 

In general the Lagrangian density can be a function of $q$, $\nabla q$, $\frac{\partial q}{\partial t}$, $x$, $y$, $z$, and $t$. It is of interest that Hamilton's
principle leads to a set of partial differential equations of motion, based on the Lagrangian density, that are
analogous to the Lagrange equations of motion for discrete systems. When deriving the Lagrangian equations
of motion in terms of the Lagrangian density using Hamilton's principle, the notation is simplified if the
system is limited to one spatial coordinate $x$. In addition, it is convenient to use the compact notation
where the spatial derivative is $q' \equiv \frac{dq}{dx}$ and the time derivative is $\dot{q} \equiv \frac{dq}{dt}$, that is, where the one-dimensional
Lagrangian density is assumed to be a function $\mathfrak{L}(q,q',\dot{q},x,t)$. The appearance of the derivative $q' \equiv \frac{dq}{dx}$ as
an argument of the Lagrange density is a consequence of the continuous dependence of $q$ on $x$. In principle,
higher-order derivatives could occur but they do not arise in most problems of physical interest.

Assuming that the one spatial dimension is $x$, then Hamilton's principle of least action can be expressed
in terms of the Lagrangian density as

\[ \delta S = \delta\int_{t_1}^{t_2} L(q,\dot{q},t)\,dt = \delta\int_{t_1}^{t_2}\int_{x_1}^{x_2}\mathfrak{L}(q,q',\dot{q},x,t)\,dx\,dt \tag{15.8} \]

Following the same approach used in chapter 5.2, it is assumed that the stationary path for the action
integral is described by the function $q(x,t)$. Define a neighboring function using a parametric representation
$q(x,t;\epsilon)$ such that for $\epsilon = 0$, $q = q(x,t)$ is the function that yields the stationary action integral $S$.
Assume that an infinitesimal fraction $\epsilon$ of a neighboring function $\eta(x,t)$ is added to the extremum path
$q(x,t)$. That is, assume

\[ q(x,t;\epsilon) = q(x,t) + \epsilon\eta(x,t) \tag{15.9} \]

\[ q'(x,t;\epsilon) \equiv \frac{\partial q(x,t;\epsilon)}{\partial x} = \frac{\partial q(x,t)}{\partial x} + \epsilon\frac{\partial\eta(x,t)}{\partial x} = q'(x,t) + \epsilon\eta'(x,t) \tag{15.10} \]

\[ \dot{q}(x,t;\epsilon) \equiv \frac{\partial q(x,t;\epsilon)}{\partial t} = \frac{\partial q(x,t)}{\partial t} + \epsilon\frac{\partial\eta(x,t)}{\partial t} = \dot{q}(x,t) + \epsilon\dot{\eta}(x,t) \tag{15.11} \]

where it is assumed that both the extremum function $q(x,t)$ and the auxiliary function $\eta(x,t)$ are well
behaved functions of $x$ and $t$, with continuous first derivatives, and that $\eta(x,t) = 0$ at $(x_1,t_1)$ and $(x_2,t_2)$
because, for all possible paths, the function $q(x,t;\epsilon)$ must be identical with $q(x,t)$ at the end points of the
path, i.e. $\eta(x_1,t_1) = \eta(x_2,t_2) = 0$.

A parametric family of curves $S(\epsilon)$, as a function of the admixture coefficient $\epsilon$, is described by the
function

\[ S(\epsilon) = \int_{t_1}^{t_2}\int_{x_1}^{x_2}\mathfrak{L}\left(q(x,t;\epsilon),q'(x,t;\epsilon),\dot{q}(x,t;\epsilon),x,t\right)dx\,dt \tag{15.12} \]

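Then Hamilton's principle requires that the action integral be stationary for $\epsilon = 0$, that is,
$S(\epsilon)$ is independent of $\epsilon$ to first order, which is satisfied if

\[ \frac{\partial S(\epsilon)}{\partial\epsilon} = \int_{t_1}^{t_2}\int_{x_1}^{x_2}\left(\frac{\partial\mathfrak{L}}{\partial q}\frac{\partial q}{\partial\epsilon} + \frac{\partial\mathfrak{L}}{\partial\dot{q}}\frac{\partial\dot{q}}{\partial\epsilon} + \frac{\partial\mathfrak{L}}{\partial q'}\frac{\partial q'}{\partial\epsilon}\right)dx\,dt = 0 \tag{15.13} \]

Equations 15.9, 15.10, and 15.11 give the partial differentials

\[ \frac{\partial q}{\partial\epsilon} = \eta(x,t) \tag{15.14} \]

\[ \frac{\partial q'}{\partial\epsilon} = \eta'(x,t) \tag{15.15} \]

\[ \frac{\partial\dot{q}}{\partial\epsilon} = \dot{\eta}(x,t) \tag{15.16} \]

Integration by parts in both the $x$ and $t$ terms in equation 15.13, plus using the fact that $\eta(x_1,t_1) =
\eta(x_2,t_2) = 0$ at both end points, yields

\[ \int_{t_1}^{t_2}\frac{\partial\mathfrak{L}}{\partial\dot{q}}\frac{\partial\dot{q}}{\partial\epsilon}\,dt = -\int_{t_1}^{t_2}\frac{d}{dt}\left(\frac{\partial\mathfrak{L}}{\partial\dot{q}}\right)\frac{\partial q}{\partial\epsilon}\,dt \tag{15.17} \]

\[ \int_{x_1}^{x_2}\frac{\partial\mathfrak{L}}{\partial q'}\frac{\partial q'}{\partial\epsilon}\,dx = -\int_{x_1}^{x_2}\frac{d}{dx}\left(\frac{\partial\mathfrak{L}}{\partial q'}\right)\frac{\partial q}{\partial\epsilon}\,dx \tag{15.18} \]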

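Therefore Hamilton's principle, equation 15.13, becomes

\[ \frac{\partial S(\epsilon)}{\partial\epsilon} = \int_{t_1}^{t_2}\int_{x_1}^{x_2}\left[\frac{\partial\mathfrak{L}}{\partial q} - \frac{d}{dt}\left(\frac{\partial\mathfrak{L}}{\partial\dot{q}}\right) - \frac{d}{dx}\left(\frac{\partial\mathfrak{L}}{\partial q'}\right)\right]\eta(x,t)\,dx\,dt = 0 \tag{15.19} \]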


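Since the auxiliary function $\eta(x,t)$ is arbitrary, the integrand term in the square brackets of equation
15.19 must equal zero. That is,

\[ \frac{\partial}{\partial t}\left(\frac{\partial\mathfrak{L}}{\partial\dot{q}}\right) + \frac{\partial}{\partial x}\left(\frac{\partial\mathfrak{L}}{\partial q'}\right) - \frac{\partial\mathfrak{L}}{\partial q} = 0 \tag{15.20} \]

Equation 15.20 gives the equations of motion in terms of the Lagrangian density that has been derived
based on Hamilton's principle.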
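As a consistency check, equation 15.20 can be applied to the Lagrangian density of the uniform linear chain, equation 15.5. The short derivation below is a sketch that assumes $\rho$ and $E$ are constants and uses the compact notation $\dot{q}$ and $q'$ for the time and spatial derivatives; it recovers the wave equation 15.6.

```latex
\begin{align*}
\mathfrak{L} &= \tfrac{1}{2}\left(\rho\,\dot{q}^{2} - E\,q'^{2}\right), \\
\frac{\partial\mathfrak{L}}{\partial\dot{q}} &= \rho\,\dot{q}, \qquad
\frac{\partial\mathfrak{L}}{\partial q'} = -E\,q', \qquad
\frac{\partial\mathfrak{L}}{\partial q} = 0, \\
\frac{\partial}{\partial t}\left(\rho\,\dot{q}\right)
  + \frac{\partial}{\partial x}\left(-E\,q'\right) - 0
  &= \rho\,\frac{\partial^{2} q}{\partial t^{2}} - E\,\frac{\partial^{2} q}{\partial x^{2}} = 0 .
\end{align*}
```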


15.3.2  Three  spatial  dimensions 

Equation 15.4 expresses the Lagrangian as an integral of the Lagrangian density over a single continuous
index $q(x,t)$ where the Lagrangian density is a function $\mathfrak{L}(q,\frac{dq}{dt},\frac{dq}{dx},x,t)$. The derivation of the Lagrangian
equations of motion in terms of the Lagrangian density for three spatial dimensions involves the straightforward
addition of the $y$ and $z$ coordinates. That is, in three dimensions the vector displacement is expressed
by the vector $\mathbf{q}(x,y,z,t)$ and the Lagrangian density is related to the Lagrangian by integration over three
dimensions. That is, they are related by the equation

\[ L = \int\mathfrak{L}\left(\mathbf{q},\frac{d\mathbf{q}}{dt},\nabla\cdot\mathbf{q},x,y,z,t\right)d\tau \tag{15.21} \]

where, in cartesian coordinates, the volume element $d\tau = dx\,dy\,dz$. The Lagrangian density is a function
$\mathfrak{L}(\mathbf{q},\frac{d\mathbf{q}}{dt},\nabla\cdot\mathbf{q},x,y,z,t)$ where the one field quantity $q(x,t)$ has been extended to a spatial vector $\mathbf{q}(x,y,z,t)$
and the spatial derivatives $q'$ have been transformed into $\nabla\cdot\mathbf{q}$. Applying the method used for the
one-dimensional spatial system to the three-dimensional system leads to the following set of equations of motion

\[ \frac{\partial}{\partial t}\left(\frac{\partial\mathfrak{L}}{\partial\frac{\partial q}{\partial t}}\right) + \frac{\partial}{\partial x}\left(\frac{\partial\mathfrak{L}}{\partial\frac{\partial q}{\partial x}}\right) + \frac{\partial}{\partial y}\left(\frac{\partial\mathfrak{L}}{\partial\frac{\partial q}{\partial y}}\right) + \frac{\partial}{\partial z}\left(\frac{\partial\mathfrak{L}}{\partial\frac{\partial q}{\partial z}}\right) - \frac{\partial\mathfrak{L}}{\partial q} = 0 \tag{15.22} \]


where  the  x,y,  z spatial  derivatives  have  been  written  explicitly  for  clarity. 

Note  that  the  equations  of  motion,  equation  15.22,  treat  the  spatial  and  time  coordinates  symmetrically. 
This  symmetry  between  space  and  time  is  unchanged  by  multiplying  the  spatial  and  time  coordinate  by 
arbitrary  numerical  factors.  This  suggests  the  possibility  of  introducing  a four-dimensional  coordinate  system 


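\[ x_\mu \equiv \{x, y, z, \alpha t\} \]

where the parameter $\alpha$ is freely chosen. Using this 4-dimensional formalism allows equation 15.22 to be
written more compactly as

\[ \sum_{\mu=1}^{4}\frac{\partial}{\partial x_\mu}\left(\frac{\partial\mathfrak{L}}{\partial\frac{\partial q}{\partial x_\mu}}\right) - \frac{\partial\mathfrak{L}}{\partial q} = 0 \tag{15.23} \]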


As discussed in chapter 16, relativistic mechanics treats time and space symmetrically, that is, a four-dimensional
vector $q(x,y,z,t)$ can be used that treats time and the three spatial dimensions symmetrically
and equally. This four-dimensional space-time formulation allows the first four terms in equation 15.22 to be
condensed into a single term which illustrates the symmetry underlying equation 15.23. If the Lagrangian
density is Lorentz invariant, and if $\alpha = ic$, then equation 15.23 is covariant. Thus the Lagrangian density
formulation is ideally suited to the development of relativistically covariant descriptions of fields.


15.4  The  Hamiltonian  density  formulation  for  continuous  systems 


Chapter  15.3  illustrates,  in  general  terms,  how  field  theory  can  be  expressed  in  a Lagrangian  formulation 
via  use  of  the  Lagrange  density.  It  is  equally  possible  to  obtain  a Hamiltonian  formulation  for  continuous 
systems  analogous  to  that  obtained  for  discrete  systems.  As  summarized  in  chapter  14,  the  Hamiltonian 
and  Hamilton’s  canonical  equations  of  motion  are  related  directly  to  the  Lagrangian  by  use  of  a Legendre 
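transformation. The Hamiltonian is defined as being

\[ H \equiv \sum_i \dot{q}_i\frac{\partial L}{\partial\dot{q}_i} - L(q_i,\dot{q}_i,t) \tag{15.24} \]

The generalized momentum is defined to be

\[ p_i \equiv \frac{\partial L}{\partial\dot{q}_i} \tag{15.25} \]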


Equation (15.25) allows the Hamiltonian (15.24) to be written in terms of the conjugate momenta as

\[ H(q_i,p_i,t) = \sum_i p_i\dot{q}_i - L(q_i,\dot{q}_i,t) = \sum_i\left(p_i\dot{q}_i - L_i(q_i,\dot{q}_i,t)\right) \tag{15.26} \]

where the Lagrangian has been partitioned into the terms for each of the individual coordinates, that is,
$L = \sum_i L_i(q_i,\dot{q}_i,t)$.

In the limit that the coordinates $q,p$ are continuous, the summation in equation 15.26 can be
transformed into a volume integral over the Lagrangian density $\mathfrak{L}$. In addition, a momentum density can be
represented by the vector field $\boldsymbol{\pi}$ where

\[ \boldsymbol{\pi} \equiv \frac{\partial\mathfrak{L}}{\partial\dot{\mathbf{q}}} \tag{15.27} \]

Then the obvious definition of the Hamiltonian density $\mathfrak{H}$ is

\[ H = \int\mathfrak{H}\,d\tau = \int\left(\boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathfrak{L}\right)d\tau \tag{15.28} \]

where the Hamiltonian density is defined to be

\[ \mathfrak{H} \equiv \boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathfrak{L} \tag{15.29} \]

Unfortunately the Hamiltonian density formulation does not treat space and time symmetrically, making
it more difficult to develop relativistically covariant descriptions of fields. Hamilton's principle can be used
to derive the Hamilton equations of motion in terms of the Hamiltonian density analogous to the approach
used to derive the Lagrangian density equations of motion. As described in Classical Mechanics 2nd edition
by Goldstein, the resultant Hamilton equations of motion for one dimension are

\[ \frac{\partial\mathfrak{H}}{\partial\pi} = \dot{q} \tag{15.30} \]

\[ \frac{\partial\mathfrak{H}}{\partial q} - \frac{d}{dx}\frac{\partial\mathfrak{H}}{\partial q'} = -\dot{\pi} \tag{15.31} \]

\[ \frac{\partial\mathfrak{H}}{\partial t} = -\frac{\partial\mathfrak{L}}{\partial t} \tag{15.32} \]

Note that equation 15.31 differs from that for discrete systems.
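As an illustration, the Hamiltonian density formulation can be applied to the uniform linear chain of chapter 15.2. The sketch below assumes constant $\rho$ and $E$ and the one-dimensional Lagrangian density of equation 15.5; the Hamilton equations 15.30 and 15.31 then reproduce the wave equation 15.6, with the extra spatial-derivative term in equation 15.31 supplying the restoring force.

```latex
\begin{align*}
\pi &= \frac{\partial\mathfrak{L}}{\partial\dot{q}} = \rho\,\dot{q}, \\
\mathfrak{H} &= \pi\,\dot{q} - \mathfrak{L} = \frac{\pi^{2}}{2\rho} + \frac{1}{2}E\,q'^{2}, \\
\frac{\partial\mathfrak{H}}{\partial\pi} &= \frac{\pi}{\rho} = \dot{q}, \\
\frac{\partial\mathfrak{H}}{\partial q} - \frac{d}{dx}\frac{\partial\mathfrak{H}}{\partial q'}
  &= -E\,q'' = -\dot{\pi}
  \quad\Longrightarrow\quad
  \rho\,\ddot{q} - E\,q'' = 0 .
\end{align*}
```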


15.5  Linear  elastic  solids 

Elasticity  is  a property  of  matter  where  the  atomic  forces  in  matter  act  to  restore  the  shape  of  a solid  when 
distorted  due  to  the  application  of  external  forces.  A perfectly  elastic  material  returns  to  its  original  shape 
if  the  external  force  producing  the  deformation  is  removed.  Materials  are  elastic  when  the  external  forces 
do  not  exceed  the  elastic  limit.  Above  the  elastic  limit,  solids  can  exhibit  plastic  flow  and  concomitant  heat 
dissipation.  Such  non-elastic  behavior  in  solids  occurs  when  they  are  subject  to  strong  external  forces. 

The discussion of linear systems, in chapters 3 and 12, focussed on one dimensional systems, such as the
linear chain, where the transverse rigidity of the chain was ignored. An extension of the one-dimensional
linear chain to two-dimensional membranes, such as a drum skin, is straightforward if the membrane is thin
enough so that the rigidity of the membrane can be ignored. Elasticity for three-dimensional solids requires
accounting for the strong elastic forces exerted against any change in shape in addition to elastic forces
opposing change in volume. The stiffness of solids to changes in shape, or volume, is best represented using
the concepts of stress and strain. Forces in matter can be divided into two classes: (1) body forces, such as
gravity, which act on each volume element, and (2) surface forces which are the forces that act on both sides
of any infinitesimal surface element inside the solid. Surface forces can have components along the normal
to the infinitesimal surface, as well as shear components in the plane of the surface element. Typically solids
are elastic to both normal and shear components of the surface forces whereas shear forces in liquids and
gases lead to fluid flow plus viscous forces due to energy dissipation.

As described below, the forces acting on an infinitesimal surface element are best expressed in terms of
the stress tensor, while the relative distortion of the shape, or volume, of the body is best expressed in
terms of the strain tensor. The moduli of elasticity are the ratios of the corresponding stress and strain
tensors. The moduli of elasticity are constant in linear elastic solids and thus the stress is proportional to
the strain provided that the strains do not exceed the elastic limit.

15.5.1  Stress  tensor

Consider an infinitesimal surface area $dA$ of an arbitrary closed volume element $dV$ inside the medium.
The surface area element is defined as a vector $d\mathbf{A} = \hat{\mathbf{n}}\,dA$ where $\hat{\mathbf{n}}$ is the outward normal to the closed
surface that encloses the volume element. Assume that $d\mathbf{F}$ is the force element exerted by the outside on
the material inside the volume element. The stress tensor $\mathbf{T}$ is defined as the ratio of $d\mathbf{F}$ and $d\mathbf{A}$ where the
force vector $d\mathbf{F}$ is given by the inner product of the stress tensor $\mathbf{T}$ and the surface element vector $d\mathbf{A}$. That
is,

\[ d\mathbf{F} = \mathbf{T}\cdot d\mathbf{A} \tag{15.33} \]

Since both $d\mathbf{F}$ and $d\mathbf{A}$ are vectors, equation 15.33 implies that the stress tensor must be a second-rank
tensor as described in appendix E, that is, the stress tensor is analogous to the rotation matrix or inertia
tensor. Note that if $d\mathbf{F}$ and $\hat{\mathbf{n}}\,dA$ are collinear, then the stress tensor $\mathbf{T}$ reduces to the conventional pressure
$P$. The general stress tensor equals the momentum flux density and has the dimensions of pressure.
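Equation 15.33 is a matrix-vector product, which the following minimal sketch makes concrete. The function names are hypothetical, and the sign convention $\mathbf{T} = -P\,\mathbf{I}$ for a uniform pressure (force directed against the outward normal) is an assumption made here for illustration, not a statement taken from the text.

```python
def force_on_element(T, n, dA):
    """Force components dF_i = sum_j T_ij n_j dA on a surface element (equation 15.33)."""
    return [sum(T[i][j] * n[j] for j in range(3)) * dA for i in range(3)]


def isotropic_pressure_tensor(P):
    """Stress tensor for a uniform pressure P, assuming the convention T = -P * I."""
    return [[-P if i == j else 0.0 for j in range(3)] for i in range(3)]


if __name__ == "__main__":
    n = [0.0, 0.0, 1.0]                 # outward unit normal along z
    dA = 2.0e-4                         # area of the element in m^2 (illustrative)
    T = isotropic_pressure_tensor(101325.0)
    print(force_on_element(T, n, dA))   # force is collinear with n for pure pressure

    # A shear component T_xz produces a force in the plane of the surface element.
    T_shear = [[0.0, 0.0, 5.0e3], [0.0, 0.0, 0.0], [5.0e3, 0.0, 0.0]]
    print(force_on_element(T_shear, n, dA))
```

For the pure-pressure case the force is parallel to the normal, while the shear example gives a force perpendicular to it, which is why a rank-2 tensor is needed in general.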


15.5.2  Strain  tensor 


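Forces applied to a solid body can lead to translational, or rotational acceleration, in addition to changing
the shape or volume of the body. Elastic forces do not act when an overall displacement $\boldsymbol{\xi}$ of an infinitesimal
volume occurs, such as is involved in translational or rotational motion. Elastic forces act to oppose
position-dependent differences in the displacement vector $\boldsymbol{\xi}$, that is, the strain depends on the tensor product $\nabla\otimes\boldsymbol{\xi}$.
For an elastic medium the strain depends only on the applied stress and not on the prior loading history.

Consider that the matter at the location $\mathbf{r}$ is subject to an elastic displacement $\boldsymbol{\xi}$, and similarly at a
displaced location $\mathbf{r}' = \mathbf{r} + \sum_i\frac{\partial\mathbf{r}}{\partial x_i}dx_i$ where $x_i$ are cartesian coordinates. The net relative displacement
between $\mathbf{r}$ and $\mathbf{r}'$ is given by

\[ d\xi^2 = \sum_i\left(dx_i + d\xi_i\right)^2 - \sum_i\left(dx_i\right)^2 = \sum_{ik}\left(\frac{\partial\xi_i}{\partial x_k} + \frac{\partial\xi_k}{\partial x_i} + \sum_m\frac{\partial\xi_m}{\partial x_i}\frac{\partial\xi_m}{\partial x_k}\right)dx_i\,dx_k \tag{15.34} \]

Ignoring the second order term gives the $i$th component of the relative displacement to be

\[ d\xi_i = \sum_k\frac{1}{2}\left(\frac{\partial\xi_i}{\partial x_k} + \frac{\partial\xi_k}{\partial x_i}\right)dx_k \tag{15.35} \]

Define the elements of the strain tensor to be given by

\[ \sigma_{ik} \equiv \frac{1}{2}\left(\frac{\partial\xi_i}{\partial x_k} + \frac{\partial\xi_k}{\partial x_i}\right) \tag{15.36} \]

then

\[ d\xi_i = \sum_k\sigma_{ik}\,dx_k \tag{15.37} \]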
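The statement that elastic forces do not respond to overall rotations can be illustrated with the symmetric strain tensor of equation 15.36. The sketch below is a minimal example with hypothetical function names; it takes the displacement-gradient matrix $\partial\xi_i/\partial x_k$ as input and assumes displacements small enough that the second-order term of equation 15.34 is negligible.

```python
def strain_tensor(grad):
    """Symmetric strain tensor sigma_ik = (grad[i][k] + grad[k][i]) / 2.

    grad[i][k] is the displacement gradient d(xi_i)/d(x_k).
    """
    return [[0.5 * (grad[i][k] + grad[k][i]) for k in range(3)] for i in range(3)]


def trace(matrix):
    """Sum of the diagonal elements; for the strain tensor this is the relative volume change."""
    return sum(matrix[i][i] for i in range(3))


if __name__ == "__main__":
    theta = 1.0e-3   # small rotation angle about the z axis, in radians
    # Infinitesimal rigid rotation: xi = theta * (-y, x, 0), an antisymmetric gradient.
    rotation_grad = [[0.0, -theta, 0.0], [theta, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(strain_tensor(rotation_grad))      # all elements vanish: no strain

    a = 2.0e-4       # uniform dilation: xi = a * r
    dilation_grad = [[a, 0.0, 0.0], [0.0, a, 0.0], [0.0, 0.0, a]]
    print(trace(strain_tensor(dilation_grad)))   # relative volume change 3a
```

The antisymmetric part of the displacement gradient describes rotation and drops out of the strain, while the trace of the strain gives the dilation.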


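Thus the strain tensor $\boldsymbol{\sigma}$ is a rank-2 tensor defined as the ratio of the relative displacement vector $d\boldsymbol{\xi}$ and the infinitesimal
separation vector $d\mathbf{r}$.

\[ d\boldsymbol{\xi} = \boldsymbol{\sigma}\cdot d\mathbf{r} \tag{15.38} \]

where the component form of the rank-2 strain tensor is

\[ \boldsymbol{\sigma} = \begin{pmatrix}
\frac{\partial\xi_1}{\partial x_1} & \frac{1}{2}\left(\frac{\partial\xi_1}{\partial x_2} + \frac{\partial\xi_2}{\partial x_1}\right) & \frac{1}{2}\left(\frac{\partial\xi_1}{\partial x_3} + \frac{\partial\xi_3}{\partial x_1}\right) \\
\frac{1}{2}\left(\frac{\partial\xi_2}{\partial x_1} + \frac{\partial\xi_1}{\partial x_2}\right) & \frac{\partial\xi_2}{\partial x_2} & \frac{1}{2}\left(\frac{\partial\xi_2}{\partial x_3} + \frac{\partial\xi_3}{\partial x_2}\right) \\
\frac{1}{2}\left(\frac{\partial\xi_3}{\partial x_1} + \frac{\partial\xi_1}{\partial x_3}\right) & \frac{1}{2}\left(\frac{\partial\xi_3}{\partial x_2} + \frac{\partial\xi_2}{\partial x_3}\right) & \frac{\partial\xi_3}{\partial x_3}
\end{pmatrix} \tag{15.39} \]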


The potential-energy density for linear elastic forces is quadratic in the strain components. That is, it is
of the form

\[ U = \sum_{ijkl}\frac{1}{2}C_{ijkl}\,\sigma_{ij}\sigma_{kl} \tag{15.40} \]

where $C_{ijkl}$ is a rank-4 tensor. No preferential directions remain for a homogeneous isotropic elastic body
which allows for two contractions, thereby reducing the potential energy density to the inner product

\[ U = \sum_{ik}\frac{1}{2}D_{ik}\left(\sigma_{ik}\right)^2 \tag{15.41} \]

15.5.3  Moduli  of  elasticity

The  modulus  of  elasticity  of  a body  is  defined  to  be  the  slope  of  the  stress-strain  curve  and  thus,  in 
principle,  it  is  a complicated  rank-4  tensor  that  characterizes  the  elastic  properties  of  a material.  Thus  the 
general  theory  of  elasticity  is  complicated  because  the  elastic  properties  depend  on  the  orientation  of  the 
microscopic  composition  of  the  elastic  matter.  The  theory  simplifies  considerably  for  homogeneous,  isotropic 
linear  materials  below  the  elastic  limit,  where  the  strain  is  proportional  to  the  applied  stress.  That  is,  the 
modulus  of  elasticity  then  reduces  by  contractions  to  a constant  scalar  value  that  depends  on  the  properties 
of  the  matter  involved. 

The potential energy density for homogeneous, isotropic, linear material, equation 15.41, can be separated
into diagonal and off-diagonal components of the strain tensor. That is,

\[ U = \frac{1}{2}\left[\lambda\left(\sum_i\sigma_{ii}\right)^2 + 2\mu\sum_{ik}\left(\sigma_{ik}\right)^2\right] \tag{15.42} \]

The diagonal first term is the dilation term which corresponds to changes in the volume with no changes
in shape. The off-diagonal second term involves the shear terms that correspond to changes of the shape of
the body that also change the volume. The constants $\lambda$ and $\mu$ are Lamé's moduli of elasticity which are
positive. Various moduli of elasticity, corresponding to different distortions in the shape and volume of any
solid body, can be derived from Lamé's moduli for the material.

The components of the elastic forces can be derived from the gradient of the elastic potential energy,
equation 15.42, by use of Gauss' law plus vector differential calculus. The components of the elastic force,
derived from the strain tensor $\boldsymbol{\sigma}$, can be associated with the corresponding components of the stress tensor
$\mathbf{T}$. Thus, for homogeneous isotropic linear materials, the components of the stress tensor are related to the
strain tensor by the relation

\[ T_{ij} = \lambda\,\delta_{ij}\sum_k\sigma_{kk} + 2\mu\,\sigma_{ij} \tag{15.43} \]

where it has been assumed that $\sigma_{ij} = \sigma_{ji}$. The two moduli of elasticity $\lambda$ and $\mu$ are material-dependent
constants. Equation 15.43 can be written in tensor notation as

\[ \mathbf{T} = \lambda\,tr(\boldsymbol{\sigma})\,\mathbf{I} + 2\mu\,\boldsymbol{\sigma} \tag{15.44} \]

where $tr(\boldsymbol{\sigma})$ is the trace of the strain tensor and $\mathbf{I}$ is the identity matrix.

Equation 15.44 can be inverted to give the strain tensor components in terms of the stress tensor components.

\[ \sigma_{ij} = \frac{1}{2\mu}\left[T_{ij} - \frac{\lambda}{\left(3\lambda + 2\mu\right)}\sum_k T_{kk}\,\delta_{ij}\right] \tag{15.45} \]

The various moduli of elasticity relate combinations of different stress and strain tensor components. The
following five elastic moduli are used frequently to describe elasticity in homogeneous isotropic media, and
all are related to Lamé's two moduli of elasticity.

1) Young's modulus $E$ describes tensile elasticity which is axial stiffness of the length of a body to
deformation along the axis of the applied tensile force.

\[ E \equiv \frac{T_{11}}{\sigma_{11}} = \frac{\mu\left(3\lambda + 2\mu\right)}{\left(\lambda + \mu\right)} \tag{15.46} \]


2) Bulk modulus $B$ defines the relative dilation or compression of a body's volume to pressure
applied uniformly in all directions.

\[ B = \lambda + \frac{2}{3}\mu \tag{15.47} \]

The bulk modulus is an extension of Young's modulus to three dimensions and exceeds $E$ when Poisson's
ratio, defined below, is greater than $\frac{1}{3}$. The inverse of the bulk modulus is called the compressibility of the material.

3) Shear modulus $G$ describes the shear stiffness of a body to volume-preserving shear deformations.
The shear strain $\sigma$ becomes a deformation angle given by the ratio of the displacement along the axis of the
shear force and the perpendicular moment arm. The shear modulus $G$ equals Lamé's constant $\mu$. That is,

\[ G = \mu \tag{15.48} \]


4) Poisson's ratio $\nu$ is the negative ratio of the transverse to axial strain. It is a measure of the volume
conserving tendency of a body to contract in the directions perpendicular to the axis along which it is
stretched. In terms of Lamé's constants, Poisson's ratio equals

\[ \nu = \frac{\lambda}{2\left(\lambda + \mu\right)} \tag{15.49} \]

Note that for a stable, isotropic elastic material, Poisson's ratio is bounded between $-1.0 < \nu < 0.5$ to ensure
that the $B$, $\mu$ and $\lambda$ moduli have positive values. At the incompressible limit, $\nu = 0.5$, and the bulk modulus
and Lamé parameter $\lambda$ are infinite, that is, the compressibility is zero. Typical solids have Poisson's ratios
of $\nu \approx 0.05$ if hard and $\nu = 0.25$ if soft.
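The relations between Lamé's two moduli and the engineering moduli are simple enough to collect in one helper. The following is a minimal sketch with a hypothetical function name; it evaluates equations 15.46 to 15.49 for given $\lambda$ and $\mu$ and checks two standard identities that follow algebraically from them, $E = 2G(1+\nu)$ and $B = E/[3(1-2\nu)]$. The input values are illustrative, not data for a real material.

```python
def moduli_from_lame(lam, mu):
    """Return (E, B, G, nu) for a homogeneous isotropic linear solid.

    lam and mu are Lame's moduli; the formulas are equations 15.46 to 15.49.
    """
    E = mu * (3.0 * lam + 2.0 * mu) / (lam + mu)   # Young's modulus
    B = lam + 2.0 * mu / 3.0                       # bulk modulus
    G = mu                                         # shear modulus
    nu = lam / (2.0 * (lam + mu))                  # Poisson's ratio
    return E, B, G, nu


if __name__ == "__main__":
    lam, mu = 6.0e10, 3.0e10     # illustrative values in Pa (assumed)
    E, B, G, nu = moduli_from_lame(lam, mu)
    print(f"E = {E:.4g} Pa, B = {B:.4g} Pa, G = {G:.4g} Pa, nu = {nu:.4f}")
    # Identities implied by equations 15.46 to 15.49:
    print(abs(E - 2.0 * G * (1.0 + nu)) / E)
    print(abs(B - E / (3.0 * (1.0 - 2.0 * nu))) / B)
```

Only two of the moduli are independent for an isotropic solid, so any pair can be used to recover the others.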

The stiffness of elastic solids in terms of the elastic moduli of solids can be complicated due to the
geometry and composition of solid bodies. Often it is more convenient to express the stiffness in terms of
the spring constant $\kappa$ where

\[ \kappa = \frac{dF}{dx} \tag{15.50} \]

The spring constant is inversely proportional to the length of the spring because the strain of the material
is defined to be the fractional deformation, not the absolute deformation.
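The inverse dependence on length can be illustrated with the uniform rod of chapter 15.2, for which the spring constant is $\kappa = EA/d$. The sketch below uses hypothetical function names and the usual rule for springs in series, $1/\kappa = \sum_i 1/\kappa_i$, which is an assumption not stated in this section; it shows that two identical rods joined end to end behave as a single rod of twice the length.

```python
def rod_spring_constant(E, A, d):
    """Spring constant kappa = E*A/d of a uniform rod of length d and cross-section A."""
    return E * A / d


def series_spring_constant(*kappas):
    """Effective spring constant of springs joined in series: 1/kappa = sum(1/kappa_i)."""
    return 1.0 / sum(1.0 / kappa for kappa in kappas)


if __name__ == "__main__":
    E, A, d = 2.0e11, 1.0e-4, 0.5        # illustrative values (assumed), SI units
    one_rod = rod_spring_constant(E, A, d)
    two_in_series = series_spring_constant(one_rod, one_rod)
    double_length = rod_spring_constant(E, A, 2.0 * d)
    print(one_rod, two_in_series, double_length)
```

Doubling the length halves the spring constant because the same fractional deformation corresponds to twice the absolute extension.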


15.5.4  Equations  of  motion  in  a uniform  elastic  media 


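The divergence theorem (H.8) relates the volume integral of the divergence of $\mathbf{T}$ to the vector force
$\mathbf{F}$ acting on the closed surface.

\[ \mathbf{F} = \oint\mathbf{T}\cdot d\mathbf{A} = \int\nabla\cdot\mathbf{T}\,d\tau = \int\mathbf{f}\,d\tau \tag{15.51} \]

That is, the inner product of the del operator, $\nabla$, and the rank-2 stress tensor $\mathbf{T}$, gives the vector force
density $\mathbf{f}$. This force acting on the enclosed mass $\int\rho\,d\tau$, for the closed volume, leads to an acceleration
$\frac{d^2\boldsymbol{\xi}}{dt^2}$. Thus

\[ \int\nabla\cdot\mathbf{T}\,d\tau = \int\rho\frac{d^2\boldsymbol{\xi}}{dt^2}\,d\tau \tag{15.52} \]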


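Using equation 15.44 to relate the stress tensor $\mathbf{T}$ to the moduli of elasticity gives

\[ \rho\frac{\partial^2\xi_i}{\partial t^2} = \sum_j\left[\left(\lambda + \mu\right)\frac{\partial^2\xi_j}{\partial x_i\partial x_j} + \mu\frac{\partial^2\xi_i}{\partial x_j^2}\right] \tag{15.53} \]

where $i = 1,2,3$. In general this equation is difficult to solve. However, for the simple case of a plane wave
in the $i = 1$ direction, the problem reduces to the following three equations

\[ \rho\frac{\partial^2\xi_1}{\partial t^2} = \left(\lambda + 2\mu\right)\frac{\partial^2\xi_1}{\partial x_1^2} \tag{15.54} \]

\[ \rho\frac{\partial^2\xi_2}{\partial t^2} = \mu\frac{\partial^2\xi_2}{\partial x_1^2} \tag{15.55} \]

\[ \rho\frac{\partial^2\xi_3}{\partial t^2} = \mu\frac{\partial^2\xi_3}{\partial x_1^2} \tag{15.56} \]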

Equation 15.54 corresponds to a longitudinal wave travelling with velocity $v = \sqrt{\frac{\lambda + 2\mu}{\rho}}$. Equations
15.55, 15.56 correspond to two perpendicular transverse waves travelling with velocity $v = \sqrt{\frac{\mu}{\rho}}$. This illustrates
the important fact that longitudinal waves travel faster than transverse waves in an elastic solid.
Seismic waves in the Earth, generated by earthquakes, exhibit this property. Note that shearing stresses do
not exist in ideal liquids and gases since they cannot maintain shear forces and thus $\mu = 0$.


15.6  Electromagnetic  field  theory

15.6.1  Maxwell  stress  tensor 

Analytical formulations for continuous systems, developed for describing elasticity, are generally applicable
when applied to other fields, such as the electromagnetic field. The use of Maxwell's stress tensor $\mathbf{T}$ to
describe momentum in the electromagnetic field is an important example of the application of continuum
mechanics in field theory.

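The Lorentz force can be written as

\[ \mathbf{F} = \int\rho\left(\mathbf{E} + \mathbf{v}\times\mathbf{B}\right)d\tau = \int\left(\rho\mathbf{E} + \mathbf{J}\times\mathbf{B}\right)d\tau = \int\mathbf{f}\,d\tau \tag{15.57} \]

where the force density $\mathbf{f}$ is defined to be

\[ \mathbf{f} \equiv \rho\mathbf{E} + \mathbf{J}\times\mathbf{B} \tag{15.58} \]

Maxwell's equations

\[ \rho = \epsilon_0\nabla\cdot\mathbf{E} \qquad\qquad \mathbf{J} = \frac{1}{\mu_0}\nabla\times\mathbf{B} - \epsilon_0\frac{\partial\mathbf{E}}{\partial t} \tag{15.59} \]

can be used to eliminate the charge and current densities in equation 15.57

\[ \mathbf{f} = \epsilon_0\left(\nabla\cdot\mathbf{E}\right)\mathbf{E} + \left(\frac{1}{\mu_0}\nabla\times\mathbf{B} - \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right)\times\mathbf{B} \tag{15.60} \]

Vector calculus gives that

\[ \frac{\partial}{\partial t}\left(\mathbf{E}\times\mathbf{B}\right) = \frac{\partial\mathbf{E}}{\partial t}\times\mathbf{B} + \mathbf{E}\times\frac{\partial\mathbf{B}}{\partial t} \tag{15.61} \]

while Faraday's law gives

\[ \frac{\partial\mathbf{B}}{\partial t} = -\nabla\times\mathbf{E} \tag{15.62} \]

Equation 15.62 allows equation 15.61 to be rewritten as

\[ \frac{\partial\mathbf{E}}{\partial t}\times\mathbf{B} = \frac{\partial}{\partial t}\left(\mathbf{E}\times\mathbf{B}\right) - \mathbf{E}\times\frac{\partial\mathbf{B}}{\partial t} = \frac{\partial}{\partial t}\left(\mathbf{E}\times\mathbf{B}\right) + \mathbf{E}\times\left(\nabla\times\mathbf{E}\right) \tag{15.63} \]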


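Equation 15.63 can be inserted in equation 15.60. In addition, a term $\frac{1}{\mu_0}\left(\nabla\cdot\mathbf{B}\right)\mathbf{B}$ can be added since
$\nabla\cdot\mathbf{B} = 0$ which allows equation 15.60 to be written in the symmetric form

\begin{align}
\mathbf{f} &= \epsilon_0\left(\nabla\cdot\mathbf{E}\right)\mathbf{E} + \frac{1}{\mu_0}\left(\nabla\cdot\mathbf{B}\right)\mathbf{B} + \frac{1}{\mu_0}\left(\nabla\times\mathbf{B}\right)\times\mathbf{B} - \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\times\mathbf{B} \tag{15.64} \\
&= \epsilon_0\left(\nabla\cdot\mathbf{E}\right)\mathbf{E} + \frac{1}{\mu_0}\left(\nabla\cdot\mathbf{B}\right)\mathbf{B} + \frac{1}{\mu_0}\left(\nabla\times\mathbf{B}\right)\times\mathbf{B} - \epsilon_0\frac{\partial}{\partial t}\left(\mathbf{E}\times\mathbf{B}\right) - \epsilon_0\mathbf{E}\times\left(\nabla\times\mathbf{E}\right) \tag{15.65}
\end{align}

Using the vector identity

\[ \nabla\left(\mathbf{A}\cdot\mathbf{B}\right) = \mathbf{A}\times\left(\nabla\times\mathbf{B}\right) + \mathbf{B}\times\left(\nabla\times\mathbf{A}\right) + \left(\mathbf{A}\cdot\nabla\right)\mathbf{B} + \left(\mathbf{B}\cdot\nabla\right)\mathbf{A} \tag{15.66} \]

Let $\mathbf{A} = \mathbf{B} = \mathbf{E}$, then

\[ \nabla\left(E^2\right) = 2\mathbf{E}\times\left(\nabla\times\mathbf{E}\right) + 2\left(\mathbf{E}\cdot\nabla\right)\mathbf{E} \tag{15.67} \]

That is

\[ \mathbf{E}\times\left(\nabla\times\mathbf{E}\right) = \frac{1}{2}\nabla\left(E^2\right) - \left(\mathbf{E}\cdot\nabla\right)\mathbf{E} \tag{15.68} \]

Similarly

\[ \mathbf{B}\times\left(\nabla\times\mathbf{B}\right) = \frac{1}{2}\nabla\left(B^2\right) - \left(\mathbf{B}\cdot\nabla\right)\mathbf{B} \tag{15.69} \]

Inserting equations 15.68 and 15.69 into equation 15.65 gives

\[ \mathbf{f} = \epsilon_0\left[\left(\nabla\cdot\mathbf{E}\right)\mathbf{E} + \left(\mathbf{E}\cdot\nabla\right)\mathbf{E} - \frac{1}{2}\nabla E^2\right] + \frac{1}{\mu_0}\left[\left(\nabla\cdot\mathbf{B}\right)\mathbf{B} + \left(\mathbf{B}\cdot\nabla\right)\mathbf{B} - \frac{1}{2}\nabla B^2\right] - \epsilon_0\frac{\partial}{\partial t}\left(\mathbf{E}\times\mathbf{B}\right) \tag{15.70} \]

This complicated formula can be simplified by defining the rank-2 Maxwell stress tensor $\mathbf{T}$ which has
components

\[ T_{ij} \equiv \epsilon_0\left(E_iE_j - \frac{1}{2}\delta_{ij}E^2\right) + \frac{1}{\mu_0}\left(B_iB_j - \frac{1}{2}\delta_{ij}B^2\right) \tag{15.71} \]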

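The inner product of the del operator and the Maxwell stress tensor is a vector with $j$ components of

\[ \left(\nabla\cdot\mathbf{T}\right)_j = \epsilon_0\left[\left(\nabla\cdot\mathbf{E}\right)E_j + \left(\mathbf{E}\cdot\nabla\right)E_j - \frac{1}{2}\nabla_jE^2\right] + \frac{1}{\mu_0}\left[\left(\nabla\cdot\mathbf{B}\right)B_j + \left(\mathbf{B}\cdot\nabla\right)B_j - \frac{1}{2}\nabla_jB^2\right] \tag{15.72} \]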

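The above definition of the Maxwell stress tensor, plus the Poynting vector $\mathbf{S} = \frac{1}{\mu_0}\left(\mathbf{E}\times\mathbf{B}\right)$, allows the force
density equation 15.58 to be written in the form

\[ \mathbf{f} = \nabla\cdot\mathbf{T} - \epsilon_0\mu_0\frac{\partial\mathbf{S}}{\partial t} \tag{15.73} \]

The divergence theorem allows the total force, acting on the volume $\tau$, to be written in the form

\begin{align}
\mathbf{F} &= \int\left(\nabla\cdot\mathbf{T} - \epsilon_0\mu_0\frac{\partial\mathbf{S}}{\partial t}\right)d\tau \tag{15.74} \\
&= \oint\mathbf{T}\cdot d\mathbf{a} - \epsilon_0\mu_0\frac{d}{dt}\int\mathbf{S}\,d\tau \tag{15.75}
\end{align}

Note that, if the Poynting vector is time independent, then the second term in equation 15.75 is zero and the
Maxwell stress tensor $\mathbf{T}$ is the force per unit area (stress) acting on the surface. The fact that $\mathbf{T}$ is a rank-2
tensor is apparent since the stress represents the ratio of the force-density vector $d\mathbf{f}$ and the infinitesimal
area vector $d\mathbf{a}$, which do not necessarily point in the same directions.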
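The component definition of the Maxwell stress tensor, equation 15.71, is easy to evaluate directly. The sketch below is a minimal illustration with a hypothetical function name and arbitrary field values; the constants are the CODATA 2018 values of $\epsilon_0$ and $\mu_0$. It shows that a pure electric field along $z$ gives a positive $T_{zz}$ (tension along the field) and negative $T_{xx}$, $T_{yy}$ (pressure perpendicular to the field), and that the trace equals minus the field energy density.

```python
EPS0 = 8.8541878128e-12    # vacuum permittivity in F/m (CODATA 2018)
MU0 = 1.25663706212e-6     # vacuum permeability in H/m (CODATA 2018)


def maxwell_stress(E, B):
    """Return the 3x3 Maxwell stress tensor T_ij of equation 15.71 for field vectors E and B."""
    E2 = sum(c * c for c in E)
    B2 = sum(c * c for c in B)
    T = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            T[i][j] = (EPS0 * (E[i] * E[j] - 0.5 * delta * E2)
                       + (B[i] * B[j] - 0.5 * delta * B2) / MU0)
    return T


def energy_density(E, B):
    """Electromagnetic energy density u = eps0*E^2/2 + B^2/(2*mu0)."""
    return 0.5 * EPS0 * sum(c * c for c in E) + 0.5 * sum(c * c for c in B) / MU0


if __name__ == "__main__":
    E = [0.0, 0.0, 1.0e3]      # illustrative field in V/m (assumed)
    B = [0.0, 0.0, 0.0]
    T = maxwell_stress(E, B)
    for row in T:
        print(row)
    print(sum(T[i][i] for i in range(3)), -energy_density(E, B))
```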


15.6.2  Momentum  in  the  electromagnetic  field 

Chapter  7.2  showed  that  the  electromagnetic  field  carries  a linear  momentum  qA  where  q is  the  charge  on  a 
body  and  A is  the  electromagnetic  vector  potential.  It  is  useful  to  use  the  Maxwell  stress  tensor  to  express 
the  momentum  density  directly  in  terms  of  the  electric  and  magnetic  fields. 

Newton's law of motion can be used to write equation 15.75 as

\[ \mathbf{F} = \frac{d\mathbf{p}_{mech}}{dt} = \oint\mathbf{T}\cdot d\mathbf{a} - \epsilon_0\mu_0\frac{d}{dt}\int\mathbf{S}\,d\tau \tag{15.76} \]

where $\mathbf{p}_{mech}$ is the total mechanical linear momentum of the volume $\tau$. Equation 15.76 implies that the
electromagnetic field carries a linear momentum

\[ \mathbf{p}_{field} = \epsilon_0\mu_0\int\mathbf{S}\,d\tau \tag{15.77} \]

The $\oint\mathbf{T}\cdot d\mathbf{a}$ term in equation 15.76 is the momentum per unit time flowing into the closed surface.

In field theory it can be useful to describe the behavior in terms of the momentum density $\boldsymbol{\pi}$. Thus
the momentum density $\boldsymbol{\pi}$ in the electromagnetic field is

\[ \boldsymbol{\pi}_{field} = \epsilon_0\mu_0\mathbf{S} \tag{15.78} \]

Then equation 15.76 implies that the total momentum density $\boldsymbol{\pi} = \boldsymbol{\pi}_{mech} + \boldsymbol{\pi}_{field}$ is related to Maxwell's
stress tensor by

\[ \frac{\partial}{\partial t}\left(\boldsymbol{\pi}_{mech} + \boldsymbol{\pi}_{field}\right) = \nabla\cdot\mathbf{T} \tag{15.79} \]

That is, like the elasticity stress tensor, the divergence of Maxwell's stress tensor $\mathbf{T}$ equals the rate of change
of the total momentum density, that is, $-\mathbf{T}$ is the momentum flux density.
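A quick numerical check of the field momentum density $\epsilon_0\mu_0\mathbf{S}$ of equation 15.78 can be made for a vacuum plane wave. The sketch below uses hypothetical function names and assumes the standard plane-wave relation $|\mathbf{B}| = |\mathbf{E}|/c$ with $c = 1/\sqrt{\epsilon_0\mu_0}$, which is not derived in this chapter. Under that assumption the momentum density equals the energy density divided by $c$.

```python
import math

EPS0 = 8.8541878128e-12    # vacuum permittivity in F/m (CODATA 2018)
MU0 = 1.25663706212e-6     # vacuum permeability in H/m (CODATA 2018)
C = 1.0 / math.sqrt(EPS0 * MU0)   # speed of light implied by the two constants


def cross(a, b):
    """Cartesian cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def field_momentum_density(E, B):
    """Momentum density eps0*mu0*S, with Poynting vector S = (E x B)/mu0."""
    S = [c / MU0 for c in cross(E, B)]
    return [EPS0 * MU0 * s for s in S]


def energy_density(E, B):
    """Electromagnetic energy density u = eps0*E^2/2 + B^2/(2*mu0)."""
    return 0.5 * EPS0 * sum(c * c for c in E) + 0.5 * sum(c * c for c in B) / MU0


if __name__ == "__main__":
    E0 = 100.0                       # illustrative amplitude in V/m (assumed)
    E = [E0, 0.0, 0.0]               # E along x
    B = [0.0, E0 / C, 0.0]           # B along y, so the wave travels along z
    pi_field = field_momentum_density(E, B)
    print(pi_field)
    print(energy_density(E, B) / C)  # equals the z component of pi_field
```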

This  discussion  of  the  Maxwell  stress  tensor  and  its  relation  to  momentum  in  the  electromagnetic  field 
illustrates  the  role  that  analytical  formulations  of  classical  mechanics  can  play  in  field  theory. 


15.7  Ideal  fluid  dynamics 

The distinction between a solid and a fluid is that a fluid flows under shear stress whereas the elasticity
of solids opposes distortion and flow. Shear stress in a fluid is opposed by dissipative viscous forces, which
depend on velocity, as opposed to elastic solids where the shear stress is opposed by the elastic forces which
depend on the displacement. An ideal fluid is one where the viscous forces are negligible, and thus the
shear-stress Lamé parameter $\mu = 0$.

15.7.1  Continuity  equation 

Fluid  dynamics  requires  a different  philosophical  approach  than  that  used  to  describe  the  motion  of  an 
ensemble  of  known  solid  bodies. The  prior  discussions  of  classical  mechanics  used,  as  variables,  the  coordinates 
of  each  member  of  an  ensemble  of  particles  with  known  masses.  This  approach  is  not  viable  for  fluids 
which  involve  an  enormous  number  of  individual  atoms  as  the  fundamental  bodies  of  the  fluid.  The  best 
philosophical approach for describing fluid dynamics is to employ continuum mechanics using definite fixed volume elements $d\tau$ and describe the fluid in terms of macroscopic variables of the fluid such as mass density $\rho$, pressure $P$, and fluid velocity $\mathbf{v}$.

Conservation of fluid mass requires that the rate of change of mass in a fixed volume must equal the net inflow of mass.

$$\frac{d}{dt}\int \rho\,d\tau + \oint \rho\mathbf{v}\cdot d\mathbf{a} = 0 \tag{15.80}$$

Using the divergence theorem allows this to be written as

$$\int\left(\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v})\right)d\tau = 0 \tag{15.81}$$

Mass conservation must hold for any arbitrary volume, therefore the continuity equation can be written in the differential form

$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0 \tag{15.82}$$
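The conservation statement behind the continuity equation can be checked numerically. The sketch below is a minimal illustration, assuming one-dimensional flow with a constant positive velocity on a periodic grid and a first-order upwind update; the grid size, the density pulse and the function name `upwind_step` are hypothetical choices made only for this example.

```python
import math

def upwind_step(rho, v, dx, dt):
    """Advance rho by one time step of d(rho)/dt + d(rho*v)/dx = 0.

    Assumes a constant velocity v > 0 and a periodic grid, so the mass
    leaving the last cell re-enters the first one.
    """
    flux = [r * v for r in rho]   # mass flux rho*v leaving each cell to the right
    # Cell i loses flux[i] and gains flux[i-1]; index -1 wraps around (periodic).
    return [rho[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(len(rho))]

n, dx, v = 100, 0.01, 1.0
dt = 0.5 * dx / v                 # satisfies the stability condition v*dt/dx <= 1
rho = [1.0 + 0.5 * math.exp(-((i * dx - 0.5) / 0.1) ** 2) for i in range(n)]

mass_before = sum(rho) * dx
for _ in range(200):
    rho = upwind_step(rho, v, dx, dt)
mass_after = sum(rho) * dx
print(mass_before, mass_after)
```

Each cell changes only by the difference between the flux entering and the flux leaving it, which is the discrete analogue of equation 15.80, so the total mass is unchanged apart from rounding error.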


15.7.2  Euler’s  hydrodynamic  equation 

The fluid surrounding a volume $\tau$ exerts a net force $\mathbf{F}$ that equals the surface integral of the pressure $P$. This force can be transformed to a volume integral of $\nabla P$. The net force then will lead to an acceleration of the volume element. That is

$$\mathbf{F} = -\oint P\,d\mathbf{a} = -\int \nabla P\,d\tau = \int \rho\frac{d\mathbf{v}}{dt}\,d\tau \tag{15.83}$$

Thus the force density $\mathbf{f}$ is given by

$$\mathbf{f} = -\nabla P = \rho\frac{d\mathbf{v}}{dt} \tag{15.84}$$


Note that the acceleration $\frac{d\mathbf{v}}{dt}$ in equation 15.83 refers to the rate of change of velocity for individual atoms in the fluid, not the rate of change of fluid velocity at a fixed point in space. These two accelerations are related by noting that, during the time $dt$, the change in velocity $d\mathbf{v}$ of a given fluid particle is composed of two parts, namely (1) the change during $dt$ in the velocity at a fixed point in space, and (2) the difference between the velocities at that same instant in time at two points displaced a distance $d\mathbf{r}$ apart, where $d\mathbf{r}$ is the distance moved by a given fluid particle during the time $dt$. The first part is given by $\frac{\partial\mathbf{v}}{\partial t}dt$ at a given point $(x,y,z)$ in space. The second part equals

$$dx\frac{\partial\mathbf{v}}{\partial x} + dy\frac{\partial\mathbf{v}}{\partial y} + dz\frac{\partial\mathbf{v}}{\partial z} = (d\mathbf{r}\cdot\nabla)\mathbf{v} \tag{15.85}$$

Thus

$$d\mathbf{v} = \frac{\partial\mathbf{v}}{\partial t}dt + (d\mathbf{r}\cdot\nabla)\mathbf{v} \tag{15.86}$$

Dividing both sides by $dt$ gives that the acceleration of the atoms in the fluid equals

$$\frac{d\mathbf{v}}{dt} = \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} \tag{15.87}$$
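The relation between the two accelerations can be verified numerically. The following sketch assumes a hypothetical one-dimensional velocity field $v(x,t) = \sin x\,\cos t$, chosen only for illustration, and compares the rate of change of velocity seen by a particle that is followed for a short time with the sum of the local and convective terms of equation 15.87.

```python
import math

def v(x, t):
    """Hypothetical one-dimensional velocity field used for illustration."""
    return math.sin(x) * math.cos(t)

def acceleration_following_particle(x, t, h=1e-6):
    """Follow the particle for a short time h: it moves from x to x + v*h."""
    return (v(x + v(x, t) * h, t + h) - v(x, t)) / h

def acceleration_local_plus_convective(x, t, h=1e-6):
    """Evaluate dv/dt + v * dv/dx at a fixed point using central differences."""
    dv_dt = (v(x, t + h) - v(x, t - h)) / (2.0 * h)
    dv_dx = (v(x + h, t) - v(x - h, t)) / (2.0 * h)
    return dv_dt + v(x, t) * dv_dx

x0, t0 = 0.7, 0.3
direct = acceleration_following_particle(x0, t0)
euler = acceleration_local_plus_convective(x0, t0)
# Closed-form value of dv/dt + v * dv/dx for this particular field.
exact = (-math.sin(x0) * math.sin(t0)
         + math.sin(x0) * math.cos(t0) * math.cos(x0) * math.cos(t0))
print(direct, euler, exact)
```

The two numerical estimates agree with each other and with the closed-form value to within the accuracy of the finite differences, even though the local term alone differs from the particle acceleration.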
Substituting equation 15.87 into 15.84 gives

$$\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = -\frac{1}{\rho}\nabla P \tag{15.88}$$

This  is  Euler’s  equation  for  hydrodynamics.  The  two  terms  on  the  left  represent  the  acceleration  in  the 
individual  fluid  components  while  the  right-hand  side  lists  the  force  density  producing  the  acceleration. 

Additional forces can be added to the right-hand side. For example, the gravitational force density $\rho\mathbf{g}$ can be expressed in terms of the gravitational scalar potential $V$ to be

$$\rho\mathbf{g} = -\rho\nabla V \tag{15.89}$$

Inclusion of the gravitational field force density in Euler’s equation gives

$$\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = -\frac{1}{\rho}\nabla(P + \rho V) \tag{15.90}$$

15.7.3  Irrotational  flow  and  Bernoulli’s  equation 

Streamlined flow corresponds to irrotational flow, that is, $\nabla\times\mathbf{v} = 0$. Since irrotational flow is curl free, the velocity streamlines can be represented by a scalar potential field $\phi$. That is

$$\mathbf{v} = -\nabla\phi \tag{15.91}$$

This scalar potential field $\phi$ can be used to derive the vector velocity field for irrotational flow.

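Note that the $(\mathbf{v}\cdot\nabla)\mathbf{v}$ term in Euler’s equation (15.90) can be rewritten using the vector identity

$$(\mathbf{v}\cdot\nabla)\mathbf{v} = \frac{1}{2}\nabla(v^2) - \mathbf{v}\times(\nabla\times\mathbf{v}) \tag{15.92}$$

Inserting equation 15.92 into Euler’s equation 15.90 then gives

$$\frac{\partial\mathbf{v}}{\partial t} - \mathbf{v}\times(\nabla\times\mathbf{v}) = -\frac{1}{\rho}\nabla\left(\frac{1}{2}\rho v^2 + P + \rho V\right) \tag{15.93}$$

Potential flow corresponds to time independent irrotational flow, that is, both $\frac{\partial\mathbf{v}}{\partial t} = 0$ and $\nabla\times\mathbf{v} = 0$. For potential flow equation 15.93 reduces to

$$\nabla\left(\frac{1}{2}\rho v^2 + P + \rho V\right) = 0$$

which implies that

$$\frac{1}{2}\rho v^2 + P + \rho V = \text{constant} \tag{15.94}$$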


This is the famous Bernoulli’s equation that relates the interplay of the fluid velocity, pressure and gravitational energy. Bernoulli’s equation plays important roles in both hydrodynamics and aerodynamics.
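A short numerical example may help to show how equation 15.94 is used. The sketch below assumes steady, incompressible, irrotational flow and takes the gravitational potential near the earth's surface to be $V = gh$; the pipe geometry, the numbers and the function name `bernoulli_pressure` are hypothetical and serve only as an illustration.

```python
def bernoulli_pressure(P1, v1, h1, v2, h2, rho, g=9.81):
    """Pressure at point 2 from equation 15.94, with V = g*h (assumed uniform gravity)."""
    return P1 + 0.5 * rho * (v1 ** 2 - v2 ** 2) + rho * g * (h1 - h2)

# Water in a horizontal pipe whose cross-section halves, so that an
# incompressible flow doubles its speed (illustrative numbers).
rho_water = 1000.0      # kg/m^3
P1 = 200000.0           # Pa
v1 = 2.0                # m/s
v2 = 2.0 * v1           # m/s, from conservation of mass
P2 = bernoulli_pressure(P1, v1, 0.0, v2, 0.0, rho_water)
print(P2)
```

The pressure is lower where the fluid moves faster, which is the interplay between velocity and pressure that Bernoulli's equation expresses.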


15.7.4  Gas  flow 

Fluid dynamics applied to gases is a straightforward extension of fluid dynamics that employs standard thermodynamical concepts. The following example illustrates the application of fluid mechanics for calculating the velocity of sound in a gas.

15.1 Example: Acoustic waves in a gas


Propagation of acoustic waves in a gas provides an example of using the three-dimensional Lagrange density. Only longitudinal waves occur in a gas and the velocity is given by thermodynamics of the gas. Let the displacement of each gas molecule be designated by the general coordinate $\mathbf{q}$ with corresponding velocity $\dot{\mathbf{q}}$. Let the gas density be $\rho$, then the kinetic energy density (KED) of an infinitesimal volume of gas $\Delta\tau$ is given by

$$\Delta(KED) = \frac{1}{2}\rho_0\dot{\mathbf{q}}^2$$

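The rapid contractions and expansions of the gas in an acoustic wave occur adiabatically such that the product $PV^{\gamma}$ is a constant, where $\gamma$ is the ratio of the specific heat at constant pressure to the specific heat at constant volume. Therefore the change in potential energy density $\Delta(PED)$ is given to second order by

$$\Delta(PED) = -\frac{1}{\tau_0}\int_{\tau_0}^{\tau_0+\Delta\tau}P\,d\tau = -\frac{P_0}{\tau_0}\Delta\tau - \frac{1}{2\tau_0}\left(\frac{\partial P}{\partial\tau}\right)_0(\Delta\tau)^2 = -\frac{P_0}{\tau_0}\Delta\tau + \frac{\gamma P_0}{2}\left(\frac{\Delta\tau}{\tau_0}\right)^2$$

where the adiabatic relation gives $\left(\frac{\partial P}{\partial\tau}\right)_0 = -\frac{\gamma P_0}{\tau_0}$. Since the volume and density are related by

$$\tau_0 = \frac{M}{\rho_0}$$

then the fractional change in the density $\sigma$ is related to the density by

$$\rho = \rho_0(1+\sigma)$$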

This implies that, since $\Delta\tau/\tau_0 \approx -\sigma$, the potential energy density (PED) is given by

$$\Delta(PED) = P_0\sigma + \frac{\gamma P_0}{2}\sigma^2$$

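The mass flowing out of the volume $\tau_0$ must equal the loss of mass given by the fractional change in density of the volume, that is

$$\rho_0\oint\mathbf{q}\cdot d\mathbf{S} = -\rho_0\int\sigma\,d\tau$$

The divergence theorem gives that

$$\oint\mathbf{q}\cdot d\mathbf{S} = \int\nabla\cdot\mathbf{q}\,d\tau = -\int\sigma\,d\tau$$

Thus the fractional density change $\sigma$ is given by minus the divergence of $\mathbf{q}$

$$\sigma = -\nabla\cdot\mathbf{q}$$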

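This allows the potential energy density to be written as

$$\Delta(PED) = -P_0\nabla\cdot\mathbf{q} + \frac{\gamma P_0}{2}(\nabla\cdot\mathbf{q})^2$$

Combining the kinetic energy density and the potential energy density gives the complete Lagrangian density for an acoustic wave in a gas to be

$$\mathcal{L} = \frac{1}{2}\rho_0\dot{\mathbf{q}}^2 + P_0\nabla\cdot\mathbf{q} - \frac{\gamma P_0}{2}(\nabla\cdot\mathbf{q})^2$$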

Inserting this Lagrangian density in the corresponding equations of motion, equation 15.23, gives that

$$\nabla^2\mathbf{q} - \frac{\rho_0}{\gamma P_0}\frac{d^2\mathbf{q}}{dt^2} = 0$$

where $P_0$ and $\rho_0$ are the ambient pressure and density of the gas. This is the wave equation where the phase velocity of sound is given by

$$v_{phase} = \sqrt{\frac{\gamma P_0}{\rho_0}}$$
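The phase velocity obtained above is easy to evaluate. The sketch below assumes typical sea-level values for air ($\gamma = 1.4$, $P_0 = 101325$ Pa, $\rho_0 = 1.225$ kg/m$^3$); these numbers and the function name `sound_speed` are illustrative assumptions rather than values taken from the text.

```python
import math

def sound_speed(gamma, P0, rho0):
    """Phase velocity of sound, sqrt(gamma * P0 / rho0), for an adiabatic gas."""
    return math.sqrt(gamma * P0 / rho0)

# Assumed sea-level values for air (illustrative).
v_air = sound_speed(gamma=1.4, P0=101325.0, rho0=1.225)
print(v_air)   # in m/s
```

The result, about 340 m/s, is close to the measured speed of sound in air, which supports the assumption that the compressions are adiabatic.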
15.8 Viscous fluid dynamics

Viscous  fluid  dynamics  is  a branch  of  classical  mechanics  that  plays  a pivotal  role  in  a wide  range  of  aspects 
of  life,  such  as  blood  flow  in  human  anatomy,  weather,  hydraulic  engineering,  and  transportation  by  land, 
sea, and air. Viscous fluid flow provides nature’s most common manifestation of nonlinearity and turbulence in classical mechanics, and provides an excellent illustration of possible solutions of non-linear equations of motion introduced in chapter 4. A detailed description of turbulence remains a challenging problem and this subject has the reputation of being the last great unsolved problem in classical mechanics. There is an apocryphal story that Werner Heisenberg was asked what he would like to ask God, if given the opportunity. His reply was "When I meet God, I am going to ask him two questions: Why relativity? And why turbulence? I really believe he will only have an answer to the first".

In contrast to solids, fluids do not have elastic restoring forces to support shear stress because the fluid flows. Shear stresses in fluids are balanced by viscous forces which are velocity dependent. There are two
mechanisms  that  lead  to  shear  stress  acting  between  adjacent  fluid  layers  in  relative  motion.  The  first 
mechanism  involves  laminar  flow  where  the  viscous  forces  produce  shear  stress  between  adjacent  layers  of 
the  fluid  which  are  moving  parallel  along  adjacent  streamlines  at  differing  velocities.  Viscous  forces  typically 
dominate  laminar  flow.  High  viscosity  fluids  like  honey  exhibit  laminar  flow  and  are  more  difficult  to  stir 
or  pour  compared  with  low-viscosity  fluids  like  water.  The  second  mechanism  involves  turbulent  flow  where 
shear  stress  is  due  to  momentum  transfer  between  adjacent  layers  when  the  flow  breaks  up  into  large-scale 
coherent  vortex  structures  which  carry  most  of  the  kinetic  energy.  These  eddies  lead  to  transverse  motion 
that  transfers  momentum  plus  heat  between  adjacent  layers  and  leads  to  higher  drag.  The  wing-tip  vortex 
produced  by  the  wing  tip  of  an  aircraft  is  an  example  of  a dynamically-distinct,  large-scale,  coherent  vortex 
structure  which  has  considerable  angular  momentum  and  decays  by  fragmentation  into  a cascade  of  smaller 
scale  structures. 


15.8.1  Navier-Stokes  equation 

Viscous forces acting on the small-scale coherent structures eventually dissipate the energy in turbulent motion. The viscous drag can be handled in terms of a stress tensor $\mathbf{T}$ analogous to its use when accounting for the elastic restoring forces in elasticity as discussed in chapter 15.5.3. That is, the viscous force density is related to the deceleration of the volume element by

$$\frac{\partial}{\partial t}(\rho\mathbf{v}) = -\nabla\cdot\mathbf{T} \tag{15.95}$$

where the components of the stress tensor are

$$T_{ki} = T_{ik} = P\delta_{ik} + \rho v_i v_k \tag{15.96}$$

Note that the stress tensor gives the momentum flux density tensor, which involves a diagonal term proportional to pressure $P$, plus a viscous drag term that is proportional to the product of two velocities.

The  Navier-Stokes  equations  are  the  fundamental  equations  characterizing  fluid  flow.  They  are  based  on 
application  of  Newton’s  second  law  of  motion  to  fluids  together  with  the  assumption  that  the  fluid  stress 
is  the  sum  of  a diffusing  viscous  term  plus  a pressure  term.  Combining  Euler’s  equation,  15.90,  with  15.95 
gives  the  Navier-Stokes  equation 


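$$\rho\left(\frac{\partial\mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right) = -\nabla P + \nabla\cdot\mathbf{T} + \mathbf{f} \tag{15.97}$$

where $\rho$ is the fluid density, $\mathbf{v}$ is the flow velocity vector, $P$ the pressure, $\mathbf{T}$ is the shear stress tensor viscous drag term, and $\mathbf{f}$ represents external body forces per unit volume such as gravity acting on the fluid.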

For incompressible flow the stress tensor term simplifies to $\nabla\cdot\mathbf{T} = \mu\nabla^2\mathbf{v}$. Then the Navier-Stokes equation simplifies to

$$\rho\left(\frac{\partial\mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right) = -\nabla P + \mu\nabla^2\mathbf{v} + \mathbf{f} \tag{15.98}$$

where $\mu\nabla^2\mathbf{v}$ is the viscosity drag term. The left-hand side of equation 15.98 represents the rate of change of momentum per unit volume while the right-hand side represents the summation of the forces per unit volume that are acting.
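The role of the viscous term in the incompressible Navier-Stokes equation can be illustrated with a simple case. For flow between two parallel plates with velocity $v_x(y,t)$ along the plates, no pressure gradient and no body force, the $\mathbf{v}\cdot\nabla\mathbf{v}$ term vanishes and the equation reduces to $\partial v_x/\partial t = (\mu/\rho)\,\partial^2 v_x/\partial y^2$. The sketch below integrates this with an explicit finite-difference scheme; the plate spacing, the kinematic viscosity (roughly that of water), the wall speed and the function name `couette_startup` are assumptions made for illustration only.

```python
def couette_startup(n=21, eta=1.0e-6, gap=1.0e-3, wall_speed=0.1, steps=2000):
    """Velocity profile between a fixed plate and a plate moving at wall_speed.

    eta is the kinematic viscosity mu/rho in m^2/s and gap is the plate
    spacing in m. Returns the velocities at n equally spaced points.
    """
    dy = gap / (n - 1)
    dt = 0.4 * dy * dy / eta       # explicit scheme is stable for eta*dt/dy**2 <= 0.5
    r = eta * dt / (dy * dy)
    v = [0.0] * n
    v[-1] = wall_speed             # no-slip: the fluid moves with the upper plate
    for _ in range(steps):
        interior = [v[i] + r * (v[i + 1] - 2.0 * v[i] + v[i - 1])
                    for i in range(1, n - 1)]
        v = [v[0]] + interior + [v[-1]]   # the wall values stay fixed
    return v

profile = couette_startup()
linear = [0.1 * i / 20 for i in range(21)]
max_dev = max(abs(a - b) for a, b in zip(profile, linear))
print(max_dev)
```

Viscosity diffuses momentum from the moving plate into the fluid until the profile becomes linear, which is the steady solution of the reduced equation.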
The Navier-Stokes equations are nonlinear due to the $(\mathbf{v}\cdot\nabla)\mathbf{v}$ term, which is quadratic in the velocity. This non-linearity leads to a wide spectrum of dynamic behavior ranging from ordered laminar
flow  to  chaotic  turbulence.  Numerical  solution  of  the  Navier-Stokes  equations  is  extremely  difficult  because 
of  the  wide  dynamic  range  of  the  dimensions  of  the  coherent  structures  involved  in  turbulent  motion.  For 
example,  simulation  calculations  require  use  of  a high  resolution  mesh  which  is  a challenge  to  the  capabilities 
of  current  generation  computers. 

The microscopic boundary condition at the interface of the solid and fluid is that the fluid molecules have zero average tangential velocity relative to the solid surface. This implies that there is a boundary layer for which there is a gradient in the tangential velocity of the fluid between the solid-fluid interface and the free-stream velocity. This velocity gradient produces vorticity in the fluid. When
the  viscous  forces  are  negligible  then  the  angular  momentum  in  any  coherent  vortex  structure  is  conserved 
leading  to  the  vortex  motion  being  preserved  as  it  propagates. 


15.8.2  Reynolds  number 


Fluid flow can be characterized by the Reynolds number $Re$ which is a dimensionless number that is a measure of the ratio of the inertial forces $\rho v^2/L$ to viscous forces $\mu v/L^2$. That is,

$$Re = \frac{\text{Inertial forces}}{\text{Viscous forces}} = \frac{\rho vL}{\mu} = \frac{vL}{\eta} \tag{15.99}$$

where $v$ is the relative velocity between the free fluid flow and the solid surface, $L$ is a characteristic linear dimension, $\mu$ is the dynamic viscosity of the fluid, $\eta$ is the kinematic viscosity ($\eta = \mu/\rho$), and $\rho$ is the density of the fluid. The Law of Similarity implies that at a given Reynolds number, for a specific shaped solid body, the fluid flow behaves identically independent of the size of the body. Thus one can use small models in wind tunnels, or water-flow tanks, to accurately model fluid flow that can be scaled up to full-sized aircraft or boats by scaling $v$ and $L$ to give the same Reynolds number.
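The scaling argument can be made concrete with a short calculation. The sketch below evaluates equation 15.99 and the flow speed needed for a scale model to match the Reynolds number of the full-sized body; the air properties, the sizes and speeds, and the function names `reynolds_number` and `model_speed_for_similarity` are assumed values and names chosen for illustration.

```python
def reynolds_number(rho, v, L, mu):
    """Reynolds number Re = rho * v * L / mu (equation 15.99)."""
    return rho * v * L / mu

def model_speed_for_similarity(v_full, L_full, L_model, eta_full, eta_model):
    """Speed for which the model has the same Re = v * L / eta as the full-sized body."""
    return v_full * (L_full / L_model) * (eta_model / eta_full)

# Assumed properties of air near sea level (illustrative).
rho_air = 1.225       # kg/m^3
mu_air = 1.81e-5      # Pa s
eta_air = mu_air / rho_air

re_full = reynolds_number(rho_air, v=30.0, L=1.0, mu=mu_air)
# A one-tenth scale model tested in the same fluid.
v_model = model_speed_for_similarity(30.0, 1.0, 0.1, eta_air, eta_air)
re_model = reynolds_number(rho_air, v=v_model, L=0.1, mu=mu_air)
print(re_full, v_model, re_model)
```

In the same fluid a one-tenth scale model needs ten times the flow speed, which is one reason model tests are sometimes carried out in a fluid with a smaller kinematic viscosity.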


15.8.3  Laminar  and  turbulent  fluid  flow 

Fluid flow over a cylinder illustrates the general features of fluid flow. The drag force $F_D$ acting on a cylinder of diameter $D$ and length $l$, with the cylindrical axis perpendicular to the fluid flow, is given by

$$F_D = \frac{1}{2}\rho v^2 C_D D l \tag{15.100}$$

where $C_D$ is the coefficient of drag. The upper part of figure 15.1 shows the dependence of the drag coefficient $C_D$ on the Reynolds number, for fluid flow that is transverse to a smooth circular cylinder. The lower part of figure 15.1 shows the streamlines for flow around the cylinder at various Reynolds numbers for the points identified by the letters A, B, C, D, and E on the plot of the drag coefficient versus Reynolds number for a smooth cylinder.
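Equation 15.100 can be evaluated directly once a drag coefficient is chosen. The sketch below assumes $C_D = 1.0$ as a representative constant value and uses a hypothetical cylinder in air; both the numbers and the function name `drag_force` are illustrative assumptions.

```python
def drag_force(rho, v, c_d, diameter, length):
    """Drag force on a cylinder transverse to the flow (equation 15.100)."""
    return 0.5 * rho * v ** 2 * c_d * diameter * length

# Cylinder of 5 cm diameter and 1 m length in air, with an assumed C_D of 1.0.
f_slow = drag_force(rho=1.225, v=10.0, c_d=1.0, diameter=0.05, length=1.0)
f_fast = drag_force(rho=1.225, v=20.0, c_d=1.0, diameter=0.05, length=1.0)
print(f_slow, f_fast, f_fast / f_slow)
```

With a constant drag coefficient, doubling the speed quadruples the drag force. The coefficient itself depends on the Reynolds number, so this quadratic scaling applies only where $C_D$ is roughly constant.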

A) At low velocities, where $Re < 1$, the flow is laminar around the cylinder in that the low vorticity is damped by the viscous forces and the $\frac{\partial\mathbf{v}}{\partial t}$ term in equation 15.98 can be ignored. The coefficient of drag $C_D$ varies inversely with $Re$ leading to drag forces that are roughly linear with velocity as described in chapter 2.10.5. The size and velocities of raindrops in a light rain shower correspond to such Reynolds numbers.

Figure 15.1: Upper: The dependence of the coefficient of drag $C_D$ on Reynolds number $Re$ for fluid flow perpendicular to a smooth circular cylinder of diameter $D$ and length $l$. Lower: Typical flow patterns for flow past a circular cylinder at various Reynolds numbers as indicated in the upper figure.

B)  For  10  < Re  < 30  the  flow  has  two  turbulent  vortices  immediately  behind  the  body  in  the  wake  of 
the  cylinder,  but  the  flow  still  is  primarily  laminar  as  illustrated. 

C) For $40 < Re < 250$ the pair of vortices peel off alternately producing a regular periodic sequence of vortices although the flow still is laminar. This sequence of vortices is called a von Kármán vortex street, for which the velocity at a given position, relative to the cylinder, is time dependent in contrast to the situation at lower Reynolds numbers.

D) For $10^3 < Re < 10^5$ viscous forces are negligible relative to the inertial effects of the vortices and boundary-layer vortices have less time to diffuse into the larger region of the fluid, thus the boundary layer is thinner. The boundary-layer flow exhibits a small scale chaotic turbulence in three dimensions superimposed on regular alternating vortex structures. In this range $C_D$ is roughly constant and thus the drag forces are proportional to the square of the velocity. This regime of Reynolds numbers corresponds to typical velocities of moving automobiles.

E) For $Re \approx 10^6$, which is typical of a flying aircraft, the inertial effects dominate except in the narrow boundary layer close to the solid-fluid interface. The chaotic region works its way further forward on the cylinder reducing the volume of the chaotic turbulent boundary layer which results in a significant decrease in $C_D$. For a sailplane wing flying at about 50 knots, the boundary layer reduces to the order of a millimeter in thickness at the leading edge and a centimeter at the trailing edge. At these Reynolds numbers the airflow comprises a thin boundary layer, where viscous effects are important, plus fluid flow in the bulk of the fluid where the vortex inertial terms dominate and viscous forces can be ignored. That is, the viscous stress tensor term $\nabla\cdot\mathbf{T}$, on the right-hand side of equation 15.97, can be ignored, and the Navier-Stokes equation reduces to the simpler Euler equation for such inviscid fluid flow.

The  importance  of  the  inertia  of  the  vortices  is  illustrated  by  the  persistence  of  the  vortex  structure 
and  turbulence  over  a wide  range  of  length  scales  characteristic  of  turbulent  flow.  The  dynamic  range  of 
the  dimension  of  coherent  vortex  structures  is  enormous.  For  example,  in  the  atmosphere  the  vortex  size 
ranges from $10^5$ m in diameter for hurricanes down to $10^{-3}$ m in thin boundary layers adjacent to an aircraft
wing.  The  transition  from  laminar  to  turbulent  flow  is  illustrated  by  water  flow  over  the  hull  of  a ship  which 
involves  laminar  flow  at  the  bow  followed  by  turbulent  flow  behind  the  bow  wave  and  at  the  stern  of  the 
ship.  The  broad  extent  of  the  white  foam  of  seawater  along  the  side  and  the  stern  of  a ship  illustrates  the 
considerable  energy  dissipation  produced  by  the  turbulence.  The  boundary  layer  of  a stalled  aircraft  wing 
is  another  example.  At  a high  angle  of  attack,  the  airflow  on  the  lower  surface  of  the  wing  remains  laminar, 
that  is,  the  stream  velocity  profile,  relative  to  the  wing,  increases  smoothly  from  zero  at  the  wing  surface 
outwards  until  it  meets  the  ambient  air  velocity  on  the  outer  surface  of  the  boundary  layer  which  is  the  order 
of  a millimeter  thick.  The  flow  on  the  top  surface  of  the  wing  initially  is  laminar  before  becoming  turbulent 
at  which  point  the  boundary  layer  rapidly  increases  in  thickness.  Further  back  the  airflow  detaches  from 
the  wing  surface  and  large-scale  vortex  structures  lead  to  a wide  boundary  layer  comparable  in  thickness  to 
the  chord  of  the  wing  with  vortex  motion  that  leads  to  the  airflow  reversing  its  direction  adjacent  to  the 
upper  surface  of  the  wing  which  greatly  increases  drag.  When  the  vortices  begin  to  shed  off  the  bounded 
surface  they  do  so  at  a certain  frequency  which  can  cause  vibrations  that  can  lead  to  structural  failure  if  the 
frequency  of  the  shedding  vortices  is  close  to  the  resonance  frequency  of  the  structure. 

Considerable  time  and  effort  are  expended  by  aerodynamicists  and  hydrodynamicists  designing  aircraft 
wings  and  ship  hulls  to  maximize  the  length  of  laminar  region  of  the  boundary  layer  to  minimize  drag. 
When  the  Reynolds  number  is  large  the  slightest  imperfections  in  the  shape  of  wing,  such  as  a speck  of 
dust,  can  trigger  the  transition  from  laminar  to  turbulent  flow.  The  boundaries  between  adjacent  large-scale 
coherent  structures  are  sensitively  identified  in  computer  simulations  by  large  divergence  of  the  streamlines 
at  any  separatrix.  A large  positive,  finite-time,  Lyapunov  exponent  identifies  divergence  of  the  streamlines 
which  occurs  at  a separatrix  between  adjacent  large-scale  coherent  vortex  structures,  whereas  the  Lyapunov 
exponents are negative for converging streamlines within any coherent structure. Computations of turbulent flow often combine the use of finite-time Lyapunov exponents, to identify coherent structures, with Lagrangian mechanics for the equations of motion. Since the Lagrangian is a scalar function, it is frame independent, and it gives far better results for fluid motion than using Newtonian mechanics. Thus the Lagrangian approach in the continua is used extensively for calculations in aerodynamics, hydrodynamics, and studies of atmospheric phenomena such as convection, hurricanes, tornadoes, etc.
15.9 Summary and implications

The  goal  of  this  chapter  is  to  provide  a glimpse  into  the  classical  mechanics  of  the  continua  which  introduces 
the  Lagrangian  density  and  Hamiltonian  density  formulations  of  classical  mechanics. 


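Lagrangian density formulation: In three dimensions the Lagrangian density $\mathcal{L}(\mathbf{q}, \frac{d\mathbf{q}}{dt}, \nabla\cdot\mathbf{q}, x, y, z, t)$ is related to the Lagrangian $L$ by taking the volume integral of the Lagrangian density.

$$L = \int\mathcal{L}(\mathbf{q}, \frac{d\mathbf{q}}{dt}, \nabla\cdot\mathbf{q}, x, y, z, t)\,d\tau \tag{15.21}$$

Applying Hamilton’s Principle to the three-dimensional Lagrangian density leads to the following set of differential equations of motion

$$\frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\frac{\partial q}{\partial t}}\right) + \frac{\partial}{\partial x}\left(\frac{\partial\mathcal{L}}{\partial\frac{\partial q}{\partial x}}\right) + \frac{\partial}{\partial y}\left(\frac{\partial\mathcal{L}}{\partial\frac{\partial q}{\partial y}}\right) + \frac{\partial}{\partial z}\left(\frac{\partial\mathcal{L}}{\partial\frac{\partial q}{\partial z}}\right) - \frac{\partial\mathcal{L}}{\partial q} = 0 \tag{15.22}$$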


Hamiltonian density formulation: In the limit that the coordinates $q, p$ are continuous, the Hamiltonian can be expressed in terms of a volume integral over the momentum density $\boldsymbol{\pi}$ and the Lagrangian density $\mathcal{L}$ where

$$\boldsymbol{\pi} = \frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}} \tag{15.27}$$

Then the obvious definition of the Hamiltonian is

$$H = \int\mathcal{H}\,dV = \int\left(\boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathcal{L}\right)d\tau \tag{15.28}$$

where the Hamiltonian density is given by

$$\mathcal{H} = \boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathcal{L} \tag{15.29}$$

These  Lagrangian  and  Hamiltonian  density  formulations  are  of  considerable  importance  to  field  theory 
and  fluid  mechanics. 
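As a worked illustration of equations 15.27 and 15.29, consider the acoustic-wave Lagrangian density of example 15.1, taken here in the form $\mathcal{L} = \frac{1}{2}\rho_0\dot{\mathbf{q}}^2 + P_0\nabla\cdot\mathbf{q} - \frac{\gamma P_0}{2}(\nabla\cdot\mathbf{q})^2$. Under that assumption the momentum density and the Hamiltonian density are

$$\boldsymbol{\pi} = \frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}} = \rho_0\dot{\mathbf{q}}$$

$$\mathcal{H} = \boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathcal{L} = \frac{\pi^2}{2\rho_0} - P_0\nabla\cdot\mathbf{q} + \frac{\gamma P_0}{2}(\nabla\cdot\mathbf{q})^2$$

The Hamiltonian density is thus the sum of the kinetic energy density and the potential energy density of the gas, as expected when the potential energy density does not depend on the velocity.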


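Linear elastic solids: The theory of continuous systems was applied to the case of linear elastic solids. The stress tensor $\mathbf{T}$ is a rank 2 tensor defined as the ratio of the force vector $d\mathbf{F}$ and the surface element vector $d\mathbf{A}$. That is, the force vector is given by the inner product of the stress tensor $\mathbf{T}$ and the surface element vector $d\mathbf{A}$.

$$d\mathbf{F} = \mathbf{T}\cdot d\mathbf{A} \tag{15.33}$$

The strain tensor $\boldsymbol{\sigma}$ also is a rank 2 tensor defined as the ratio of the strain vector $\boldsymbol{\xi}$ and infinitesimal area $d\mathbf{A}$.

$$d\boldsymbol{\xi} = \boldsymbol{\sigma}\cdot d\mathbf{A} \tag{15.38}$$

where the component form of the rank 2 strain tensor is

$$\boldsymbol{\sigma} = \begin{pmatrix}
\frac{\partial\xi_1}{\partial x_1} & \frac{\partial\xi_1}{\partial x_2} & \frac{\partial\xi_1}{\partial x_3} \\
\frac{\partial\xi_2}{\partial x_1} & \frac{\partial\xi_2}{\partial x_2} & \frac{\partial\xi_2}{\partial x_3} \\
\frac{\partial\xi_3}{\partial x_1} & \frac{\partial\xi_3}{\partial x_2} & \frac{\partial\xi_3}{\partial x_3}
\end{pmatrix} \tag{15.39}$$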


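The modulus of elasticity is defined as the slope of the stress-strain curve. For linear, homogeneous, elastic matter, the potential energy density $U$ separates into diagonal and off-diagonal components of the strain tensor

$$U = \frac{1}{2}\left[\lambda\sum_i(\sigma_{ii})^2 + 2\mu\sum_{ik}(\sigma_{ik})^2\right] \tag{15.42}$$

where the constants $\lambda$ and $\mu$ are Lamé’s moduli of elasticity which are positive. The stress tensor is related to the strain tensor by

$$T_{ij} = \lambda\delta_{ij}\sum_k\frac{\partial\xi_k}{\partial x_k} + \mu\left(\frac{\partial\xi_i}{\partial x_j} + \frac{\partial\xi_j}{\partial x_i}\right) = \lambda\delta_{ij}\sum_k\sigma_{kk} + 2\mu\sigma_{ij} \tag{15.43}$$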


Electromagnetic field theory: The rank 2 Maxwell stress tensor $\mathbf{T}$ has components

$$T_{ij} = \epsilon_0\left(E_iE_j - \frac{1}{2}\delta_{ij}E^2\right) + \frac{1}{\mu_0}\left(B_iB_j - \frac{1}{2}\delta_{ij}B^2\right) \tag{15.71}$$


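The divergence theorem allows the total electromagnetic force, acting on the volume $\tau$, to be written as

$$\mathbf{F} = \int\left(\nabla\cdot\mathbf{T} - \epsilon_0\mu_0\frac{\partial\mathbf{S}}{\partial t}\right)d\tau = \oint\mathbf{T}\cdot d\mathbf{a} - \epsilon_0\mu_0\frac{d}{dt}\int\mathbf{S}\,d\tau \tag{15.74}$$

The total momentum density is related to the stress tensor by

$$\frac{\partial}{\partial t}\left(\boldsymbol{\pi}_{mech} + \boldsymbol{\pi}_{field}\right) = \nabla\cdot\mathbf{T} \tag{15.79}$$

where the electromagnetic field momentum density is given by the Poynting vector $\mathbf{S}$ as $\boldsymbol{\pi}_{field} = \epsilon_0\mu_0\mathbf{S}$.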


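Ideal fluid dynamics: Mass conservation leads to the continuity equation

$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0 \tag{15.82}$$

Euler’s hydrodynamic equation gives

$$\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = -\frac{1}{\rho}\nabla(P + \rho V) \tag{15.90}$$

where $V$ is the scalar gravitational potential. If the flow is irrotational and time independent then

$$\frac{1}{2}\rho v^2 + P + \rho V = \text{constant} \tag{15.94}$$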

Viscous fluid dynamics: For incompressible flow the stress tensor term simplifies to $\nabla\cdot\mathbf{T} = \mu\nabla^2\mathbf{v}$. Then the Navier-Stokes equation becomes

$$\rho\left(\frac{\partial\mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right) = -\nabla P + \mu\nabla^2\mathbf{v} + \mathbf{f} \tag{15.98}$$

where $\mu\nabla^2\mathbf{v}$ is the viscosity drag term. The left-hand side of equation 15.98 represents the rate of change of momentum per unit volume while the right-hand side represents the summation of the forces per unit volume that are acting.

The Reynolds number is a dimensionless number that characterizes the ratio of inertial forces to viscous forces in a viscous medium. The evolution of flow from laminar flow to turbulent flow, with increase of Reynolds number, was discussed.

The classical mechanics of continuous fields encompasses a remarkably broad range of phenomena with important applications to laminar and turbulent fluid flow, gravitation, electromagnetism, relativity, and quantum fields.


Chapter  16 


Relativistic  mechanics 


16.1  Introduction 

Newtonian  mechanics  incorporates  the  Newtonian  concept  of  the  complete  separation  of  space  and  time. 
This  theory  reigned  supreme  from  inception,  in  1687,  until  November  1905  when  Einstein  pioneered  the 
Special  Theory  of  Relativity.  Relativistic  mechanics  undermines  the  Newtonian  concepts  of  absoluteness  of 
time  that  is  inherent  to  Newton’s  formulation,  as  well  as  when  recast  in  the  Lagrangian  and  Hamiltonian 
formulations  of  classical  mechanics.  Relativistic  mechanics  has  had  a profound  impact  on  twentieth-century 
physics  and  the  philosophy  of  science.  Classical  mechanics  is  an  approximation  of  relativistic  mechanics 
that  is  valid  for  velocities  much  less  than  the  velocity  of  light  in  vacuum.  The  term  "relativity"  refers  to 
the  fact  that  physical  measurements  are  always  made  relative  to  some  chosen  reference  frame.  Naively  one 
may  think  that  the  transformation  between  different  reference  frames  is  trivial  and  contains  little  underlying 
physics.  However,  Einstein  showed  that  the  results  of  measurements  depend  on  the  choice  of  coordinate 
system,  which  revolutionized  our  concept  of  space  and  time. 

Einstein’s  work  on  relativistic  mechanics  comprised  two  major  advances.  The  first  advance  is  the  1905 
Special  Theory  of  Relativity  which  refers  to  nonaccelerating  frames  of  reference.  The  second  major  advance 
was  the  1916  General  Theory  of  Relativity  which  considers  accelerating  frames  of  reference  and  their  relation 
to  gravity.  Thus  the  Special  Theory  is  a limiting  case  of  the  General  Theory  of  Relativity.  The  mathematically 
complex  General  Theory  of  Relativity  is  required  for  describing  accelerating  frames,  gravity,  plus  related 
topics  like  Black  Holes,  or  extremely  accurate  time  measurements  inherent  to  the  Global  Positioning  System. 
The  present  discussion  will  focus  primarily  on  the  mathematically  simple  Special  Theory  of  Relativity  since  it 
encompasses  most  of  the  physics  encountered  in  atomic,  nuclear  and  high  energy  physics.  This  chapter  uses 
the  basic  concepts  of  the  Special  Theory  of  Relativity  to  investigate  the  implications  of  extending  Newtonian, 
Lagrangian  and  Hamiltonian  formulations  of  classical  mechanics  into  the  relativistic  domain.  The  Lorentz- 
invariant  extended  Hamiltonian  and  Lagrangian  formalisms  are  introduced  since  they  are  applicable  to  the 
Special  Theory  of  Relativity.  The  General  Theory  of  Relativity  incorporates  the  gravitational  force  as  a 
geodesic phenomenon in a four-dimensional Riemannian structure based on space, time, and matter. A
superficial  introduction  will  be  given  to  the  fundamental  concepts  and  evidence  that  underlie  the  General 
Theory  of  Relativity. 


16.2  Galilean  Invariance 

As  discussed  in  chapter  2.3,  an  inertial  frame  is  one  in  which  Newton’s  Laws  of  motion  apply.  Inertial  frames 
are  non-accelerating  frames  so  that  pseudo  forces  are  not  induced.  All  reference  frames  moving  at  constant 
velocity  relative  to  an  inertial  reference,  are  inertial  frames.  Newton’s  Laws  of  nature  are  the  same  in  all 
inertial  frames  of  reference  and  therefore  there  is  no  way  of  determining  absolute  motion  because  no  inertial 
frame  is  preferred  over  any  other.  This  is  called  Galilean-Newtonian  invariance.  Galilean  invariance  assumes 
that  the  concepts  of  space  and  time  are  completely  separable.  Time  is  assumed  to  be  an  absolute  quantity 
that  is  invariant  to  transformations  between  coordinate  systems  in  relative  motion.  Also  the  element  of 
length  is  the  same  in  different  Galilean  frames  of  reference. 
Consider two coordinate systems shown in figure 16.1, where the primed frame is moving along the $x_1$ axis of the fixed unprimed frame. A Galilean transformation implies that the following relations apply:

$$\begin{aligned}
x_1' &= x_1 - vt \\
x_2' &= x_2 \\
x_3' &= x_3 \\
t' &= t
\end{aligned} \tag{16.1}$$

Note that at any instant $t$, the infinitesimal units of length in the two systems are identical since

$$ds^2 = \sum_{i=1}^{3}dx_i^2 = \sum_{i=1}^{3}dx_i'^2 = ds'^2 \tag{16.2}$$


These are the mathematical expressions of the Newtonian idea of space and time. An immediate consequence of the Galilean transformation is that the velocity of light must differ in different inertial reference frames.

At the end of the 19th century physicists thought they had discovered a way of identifying an absolute inertial frame of reference, namely the frame of the medium that transmits light in vacuum. Maxwell’s laws of electromagnetism predict that electromagnetic radiation in vacuum travels at $c = \frac{1}{\sqrt{\mu_0 \varepsilon_0}} = 2.998 \times 10^8$ m/s. Maxwell did not address in what frame of reference this speed applied. In the nineteenth century all known wave phenomena were transmitted by some medium, such as waves on a string, water waves, and sound waves in air. Physicists thus envisioned that light was transmitted by some unobserved medium which they called the ether. This ether had mystical properties; it existed everywhere, even in outer space, and yet had no other observed consequences. The ether obviously should be the absolute frame of reference.

Figure 16.1: Motion of the primed frame along the $x_1$ axis with velocity $v$ relative to the parallel unprimed frame.

In the 1880s, Michelson and Morley performed an experiment in Cleveland to try to detect this ether. They transmitted light back and forth along two perpendicular paths in an interferometer, shown in figure 16.2, and assumed that the earth’s motion about the sun led to movement through the ether.

A return trip takes longer in a moving medium, if the travel is along the direction in which the medium moves, than it does in a stationary medium. For example, you lose more time moving against a headwind than you gain travelling back with the wind. The time difference $\Delta t$, for a round trip to a distance $L$, between travelling in the direction of motion in the ether, versus travelling the same distance perpendicular to the movement in the ether, is given by $\Delta t \approx \frac{L}{c}\left(\frac{v}{c}\right)^2$ where $v$ is the relative velocity of the ether and $c$ is the velocity of light.
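
As a rough numerical check of this estimate, the following Python sketch compares the round-trip times expected in the classical ether picture with the leading-order result $\Delta t \approx \frac{L}{c}\left(\frac{v}{c}\right)^2$. The function names, the 11 m arm length, and the 30 km/s orbital speed are illustrative assumptions, not values taken from the text.

```python
import math

C = 2.998e8  # speed of light in m/s, the value quoted in the text


def round_trip_parallel(length, v, c=C):
    """Round-trip time along the direction of motion through the assumed ether."""
    return length / (c - v) + length / (c + v)


def round_trip_perpendicular(length, v, c=C):
    """Round-trip time perpendicular to the motion through the assumed ether."""
    return 2.0 * length / math.sqrt(c**2 - v**2)


def time_difference_exact(length, v, c=C):
    """Difference between the parallel and perpendicular round-trip times."""
    return round_trip_parallel(length, v, c) - round_trip_perpendicular(length, v, c)


def time_difference_approx(length, v, c=C):
    """Leading-order estimate (L/c)(v/c)^2 of the same difference."""
    return (length / c) * (v / c) ** 2


if __name__ == "__main__":
    arm = 11.0       # assumed effective arm length in metres
    v_orbit = 3.0e4  # assumed orbital speed of the earth in m/s
    print(time_difference_exact(arm, v_orbit), time_difference_approx(arm, v_orbit))
```

For speeds small compared with $c$ the two functions agree closely, which is why the leading-order form is adequate for estimating the expected fringe shift.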

Interference fringes between perpendicular light beams in an optical interferometer provide an extremely sensitive measure of this time difference. Michelson and Morley observed no measurable time difference at any time during the year, that is, the relative motion of the earth within the ether is less than 1/6 the velocity of the earth around the sun. Their conclusion was that either the ether was dragged along with the earth, or the velocity of light was dependent on the velocity of the source, but these did not agree with other observations. The failure of this experiment to detect evidence for an absolute inertial frame is important, and it confounded physicists for two decades until Einstein’s Special Theory of Relativity explained the result.


Figure 16.2: The Michelson interferometer used for the Michelson-Morley experiment. Interference of the two beams of coherent light leads to fringes that depend on the differences in phase along the two paths.


16.3 Special Theory of Relativity

16.3.1  Einstein  Postulates 

In September 1905, at the age of 26, Einstein published a seminal paper entitled "On the electrodynamics of moving bodies". He considered the relation between space and time in inertial frames of reference that are in relative motion. In this paper he made the following postulates.

1)  The  laws  of  nature  are  the  same  in  all  inertial  frames  of  reference. 

2)  The  velocity  of  light  in  vacuum  is  the  same  in  all  inertial  frames  of  reference. 

Note  that  Einstein’s  first  postulate,  coupled  with  Maxwell’s  equations,  leads  to  the  statement  that  the 
velocity  of  light  in  vacuum  is  a universal  constant.  Thus  the  second  postulate  is  unnecessary  since  it  is  an 
obvious  consequence  of  the  first  postulate  plus  Maxwell’s  equations  which  are  basic  laws  of  physics.  This 
second  postulate  explained  the  null  result  of  the  Michelson-Morley  experiment.  However,  it  was  not  this 
experimental  result  that  led  Einstein  to  the  theory  of  special  relativity;  he  deduced  the  Special  Theory  of 
Relativity  from  consideration  of  Maxwell’s  equations  of  electromagnetism.  Although  Einstein’s  postulates 
appear  reasonable,  they  lead  to  the  following  surprising  implications. 


16.3.2  Lorentz  transformation 


Galilean invariance violates the Einstein postulate that the velocity of light is a universal constant in all frames of reference. It is necessary to assume a new transformation law that renders physical laws relativistically invariant. Maxwell’s equations are relativistically invariant, and as a result some electromagnetic phenomena could not be explained using Galilean invariance. In 1904 Lorentz proposed a new transformation to replace the Galilean transformation in order to explain such electromagnetic phenomena. Einstein’s genius was that he derived the transformation, which had been proposed by Lorentz, directly from the postulates of the Special Theory of Relativity. The Lorentz transformation satisfies Einstein’s theory of relativity, and has been confirmed to be correct by many experiments.

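For the geometry shown in figure 16.1, the Lorentz transformations are:

$$\begin{aligned} x' &= \gamma (x - vt) \\ y' &= y \\ z' &= z \\ t' &= \gamma \left(t - \frac{vx}{c^2}\right) \end{aligned} \tag{16.3}$$

where the Lorentz $\gamma$ factor is

$$\gamma \equiv \frac{1}{\sqrt{1 - \left(\frac{v}{c}\right)^2}} \tag{16.4}$$

The inverse transformations are

$$\begin{aligned} x &= \gamma (x' + vt') \\ y &= y' \\ z &= z' \\ t &= \gamma \left(t' + \frac{vx'}{c^2}\right) \end{aligned} \tag{16.5}$$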

The Lorentz $\gamma$ factor, defined above, is the key feature differentiating the Lorentz transformations from the Galilean transformation. Note that $\gamma \geq 1$; also $\gamma \to 1.0$ as $v \to 0$, and $\gamma$ increases to infinity as $\frac{v}{c} \to 1$ as illustrated in figure 16.3. A useful fact that will be used later is that in the limit $\frac{v}{c} \ll 1$

$$\gamma \approx 1 + \frac{1}{2}\left(\frac{v}{c}\right)^2$$

Note that for $v \ll c$ then $\gamma \approx 1$ and the Lorentz transformation reduces to the Galilean transformation.
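
The behaviour of the Lorentz transformation can be explored numerically. The sketch below is a minimal illustration for motion along the $x$ axis; the function names and the sample numbers are assumptions chosen for this example. It implements the transformation, its inverse, and the Galilean form so that the low-velocity limit can be compared directly.

```python
import math

C = 2.998e8  # speed of light in m/s


def lorentz_gamma(v, c=C):
    """Lorentz factor for relative frame velocity v."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def lorentz_transform(x, t, v, c=C):
    """Transform an event (x, t) in the fixed frame to (x', t') in the moving frame."""
    g = lorentz_gamma(v, c)
    return g * (x - v * t), g * (t - v * x / c**2)


def lorentz_inverse(xp, tp, v, c=C):
    """Transform an event (x', t') in the moving frame back to (x, t)."""
    g = lorentz_gamma(v, c)
    return g * (xp + v * tp), g * (tp + v * xp / c**2)


def galilean_transform(x, t, v):
    """Galilean transformation of the same event, for comparison."""
    return x - v * t, t


if __name__ == "__main__":
    # At an everyday speed the two transformations are almost indistinguishable.
    print(lorentz_transform(1.0e3, 2.0, 300.0))
    print(galilean_transform(1.0e3, 2.0, 300.0))
    # At 0.8c the Lorentz factor is 1/0.6 and the difference is large.
    print(lorentz_gamma(0.8 * C))
```

Applying `lorentz_inverse` to the output of `lorentz_transform` recovers the original event, which is a convenient consistency check on the signs in the two sets of relations.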


Figure 16.3: The dependence of the Lorentz $\gamma$ factor on $\frac{v}{c}$.

Figure 16.4: The observer and mirror are at rest in the left-hand frame (a). The light beam takes a time $\Delta t = \frac{d}{c}$ to travel to the mirror, where $d$ denotes the distance to the mirror. In the right-hand frame (b) the source and mirror are travelling at a velocity $v$ relative to the observer. The light travels further in the right-hand frame of reference (b) than in the stationary frame (a). Since Einstein states that the velocity of light is the same in both frames of reference, the time interval must be larger in frame (b) since the light travels further than in (a).


16.3.3  Time  Dilation: 

Consider a clock that is fixed at $x'_a$ in a moving frame and measures the time interval between two events in the moving frame, i.e. $\Delta t'_p = t'_2 - t'_1$. According to the Lorentz transformation, the times in the fixed frame are given by:

$$t_1 = \gamma \left(t'_1 + \frac{v x'_a}{c^2}\right), \qquad t_2 = \gamma \left(t'_2 + \frac{v x'_a}{c^2}\right)$$

Thus the time interval is given by:

$$t_2 - t_1 = \gamma \left(t'_2 - t'_1\right) \tag{16.7}$$

The time between events in the rest frame of the clock, $\Delta \tau = \Delta t'_p$, is called the proper time; it always is the shortest time interval measured between a given pair of events and is represented by the symbol $\tau$. That is

$$\Delta t = \gamma \Delta t'_p = \gamma \Delta \tau \tag{16.8}$$

Note that any other frame of reference, moving with respect to the clock frame, will measure larger time intervals because $\gamma \geq 1.0$, which implies that the fixed frame perceives that the moving clock is slow by the factor $\gamma$.

The plausibility of this time dilation can be understood by looking at the simple geometry of the space ship example shown in Figure 16.4. Pretend that the clock in the proper frame of the space ship is based on the time for the light to travel to and from the mirror in the space ship. In this proper frame the light has the shortest distance to travel, and the proper transit time is

$$\Delta \tau = \frac{2d}{c} \tag{16.9}$$

where $d$ denotes the distance to the mirror. In the fixed frame (b) the component of the light's velocity in the direction of the mirror is $\sqrt{c^2 - v^2}$ using the Pythagorean theorem, assuming that the light cannot travel faster than $c$. Thus the transit time towards and back from the mirror must be

$$\Delta t = \frac{2d}{\sqrt{c^2 - v^2}} = \frac{\Delta \tau}{\sqrt{1 - \left(\frac{v}{c}\right)^2}} = \gamma \Delta \tau \tag{16.10}$$


which is the predicted time dilation.

There are many experimental verifications of time dilation in physics. For example, a stationary muon has a mean lifetime of $\tau_p \approx 2\ \mu$s, whereas the lifetime of a fast moving muon, produced in the upper atmosphere by high-energy cosmic rays, was observed in 1941 to be longer and given by $\gamma \tau_p$ as described in example 16.1. In 1972 Hafele and Keating reported using four accurate cesium atomic clocks to confirm time dilation. Two clocks were flown on regularly scheduled airlines travelling around the world, one westward and the other eastward. The other two clocks were used for reference. The westward moving clock gained $(273 \pm 7)$ ns compared to the predicted value of $(275 \pm 21)$ ns. The Global Positioning System, a constellation of nominally 24 satellites in medium Earth orbit, is used for locating positions to within a few meters. It relies on timing that is accurate to a few nanoseconds, which requires allowance for time dilation, and is a daily tribute to the correctness of Einstein’s Theory of Relativity.


16.3.4  Length  Contraction 

The Lorentz transformation leads to a contraction of the apparent length of an object in a moving frame as seen from a fixed frame. The length of a ruler in its own frame of reference is called the proper length. Consider an accurately known rod of proper length $L_p = x'_2 - x'_1$ that is at rest in the moving primed frame. The locations of both ends of this rod are measured at a given time in the stationary frame, $t_1 = t_2$, by taking a photograph of the moving rod. The corresponding locations in the moving frame are:

$$\begin{aligned} x'_2 &= \gamma (x_2 - v t_2) \\ x'_1 &= \gamma (x_1 - v t_1) \end{aligned} \tag{16.11}$$

Since $t_2 = t_1$, the measured lengths in the two frames are related by:

$$x'_2 - x'_1 = \gamma (x_2 - x_1) \tag{16.12}$$

That is, the lengths are related by:

$$L = \frac{L_p}{\gamma} \tag{16.13}$$


Note that the moving rod appears shorter in the direction of motion. As $v \to c$ the apparent length
shrinks  to  zero  in  the  direction  of  motion  while  the  dimensions  perpendicular  to  the  direction  of  motion  are 
unchanged.  This  is  called  the  Lorentz  contraction.  If  you  could  ride  your  bicycle  at  close  to  the  speed  of 
light,  you  would  observe  that  stationary  cars,  buildings,  people,  all  would  appear  to  be  squeezed  thin  along 
the  direction  that  you  are  travelling.  Also  objects  that  are  further  away  down  any  side  street  would  be 
distorted  in  the  direction  of  travel.  A photograph  taken  by  a stationary  observer  would  show  the  moving 
bicycle  to  be  Lorentz  contracted  along  the  direction  of  travel  and  the  stationary  objects  would  be  normal. 


16.3.5  Simultaneity 

The  Lorentz  transformations  imply  a new  philosophy  of  space  and  time.  A surprising  consequence  is  that 
the  concept  of  simultaneity  is  frame  dependent  in  contrast  to  the  prediction  of  Newtonian  mechanics. 

Consider that two events occur in frame $S$ at $(x_1, t_1)$ and $(x_2, t_2)$. In frame $S'$ these two events occur at $(x'_1, t'_1)$ and $(x'_2, t'_2)$. From the Lorentz transformation the time difference is

$$t'_2 - t'_1 = \gamma \left[ (t_2 - t_1) - \frac{v (x_2 - x_1)}{c^2} \right] \tag{16.14}$$

If the two events are simultaneous in frame $S$, that is $(t_2 - t_1) = 0$, then

$$t'_2 - t'_1 = \gamma \frac{v (x_1 - x_2)}{c^2} \tag{16.15}$$

Thus the events are not simultaneous in frame $S'$ if $(x_2 - x_1) = L_p \neq 0$. That is, two events that are simultaneous in one frame are not simultaneous in the other frame if the events are spatially separated. The equivalent statement is that two clocks, spatially separated by a distance $L_p$, which are synchronized in their rest frame, are not synchronized in a moving frame.

Figure 16.5: If lightning strikes the front and rear of the carriage simultaneously, according to the man in the fixed frame, then the woman in the moving frame sees the flash from the front first since she is moving towards that approaching wavefront during the transit time of the light. Thus if the length of the carriage in the stationary frame is $(x_2 - x_1) = L_p$ then the time difference is $\Delta t' = \gamma \frac{v L_p}{c^2}$.
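
The frame dependence of simultaneity is easy to evaluate numerically. The following sketch computes the time separation $t'_2 - t'_1 = \gamma\left[(t_2 - t_1) - \frac{v(x_2 - x_1)}{c^2}\right]$ in the moving frame; the function name, the 100 m separation, and the speed of $0.6c$ are assumptions made only for illustration.

```python
import math

C = 2.998e8  # speed of light in m/s


def primed_time_difference(dt, dx, v, c=C):
    """Time separation t2' - t1' in S' for two events separated by
    dt = t2 - t1 and dx = x2 - x1 in S, with S' moving at velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (dt - v * dx / c**2)


if __name__ == "__main__":
    # Two events simultaneous in S (dt = 0) but 100 m apart along x.
    print(primed_time_difference(0.0, 100.0, 0.6 * C))
    # Events at the same place and time in S are simultaneous in every frame.
    print(primed_time_difference(0.0, 0.0, 0.6 * C))
```

Reversing the sign of the spatial separation reverses the sign of the result, so which of the two events occurs first in $S'$ depends on their spatial ordering along the direction of motion.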


Einstein discussed the example shown in figure 16.5, where lightning strikes both ends of a train simultaneously in the stationary earth frame of reference. A woman on the train will see that the strikes are not simultaneous since the wavefront from the front of the carriage will be seen first because she is moving forward during the time the light from the two lightning flashes is travelling towards her. As a consequence she observes that the two lightning flashes are not simultaneous. This explains why measurement of the length of a moving rod, performed by simultaneously locating both ends in the fixed frame, implies that the measurement occurs at different times for the two ends in the moving frame, resulting in a shorter apparent length. The lack of simultaneity explains why one can get the apparent inconsistency that the moving bicyclist sees the stationary street block to be length contracted, while in contrast, a pedestrian sees that the bicycle is length contracted.

The time ordering of spatially separated events that are simultaneous in one frame is frame dependent, since $(x'_2 - x'_1)$ can be either positive or negative, and therefore the corresponding $\Delta t$ can be positive or negative. Causality is not violated because such events cannot be connected by any signal travelling at or below the velocity of light. A consequence of the lack of simultaneity is that the image shown by a photograph of a rapidly moving object is not a true representation of the moving object. Not only is the body contracted in the direction of travel, but also it appears distorted because light arriving from the far side of the body had to be emitted earlier, that is, when the body was at an earlier location, in order to reach the observer simultaneously with light from the near side. The relativistic snake paradox, addressed in workshop exercise 1, is an excellent example of the role of simultaneity in relativistic mechanics.

16.1  Example:  Muon  lifetime 

Many people have had trouble comprehending the ideas of time dilation and Lorentz contraction in the Special Theory of Relativity. The predictions appear to be crazy, but there are many examples where time dilation and Lorentz contraction are observed experimentally, such as the decay in flight of the muon. At rest, the muon decays with a mean lifetime of 2 $\mu$s. Muons are created high in the atmosphere due to cosmic ray bombardment. A typical muon travels at $v = 0.998c$ which corresponds to $\gamma \approx 15$. Time dilation implies that the lifetime of the moving muon in the earth’s frame of reference is 30 $\mu$s. The speed of the muon is essentially $c$ in both frames of reference, and it would travel 600 m in 2 $\mu$s and 9000 m in 30 $\mu$s. In fact, it is observed that the muon does travel, on average, 9000 m in the earth frame of reference before decaying. Is this inconsistent with the view of someone travelling with the muon? In the muon’s moving frame, the lifetime is only 2 $\mu$s, but the Lorentz contraction of distance means that 9000 m in the earth frame appears to be only 600 m in the moving frame, a distance it travels in 2 $\mu$s. Thus in both frames of reference we have consistent explanations, that is, the muon travels the observed distance in one lifetime.
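
The arithmetic of this example can be reproduced with a few lines of Python. The sketch below uses the speed $0.998c$ and the rounded 2 $\mu$s lifetime from the example; the function name and the returned dictionary keys are assumptions made for illustration. Without rounding, $\gamma$ comes out closer to 16 than to 15, so the unrounded distances differ slightly from the 9000 m and 600 m quoted above, but the consistency between the two frames is unchanged.

```python
import math

C = 2.998e8  # speed of light in m/s


def muon_decay_summary(beta=0.998, proper_lifetime=2.0e-6, c=C):
    """Mean lifetime and path length of a muon seen from the earth and muon frames."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    lifetime_earth = gamma * proper_lifetime    # time dilation
    distance_earth = beta * c * lifetime_earth  # path length in the earth frame
    distance_muon = distance_earth / gamma      # Lorentz-contracted path length
    return {
        "gamma": gamma,
        "lifetime_earth": lifetime_earth,
        "distance_earth": distance_earth,
        "distance_muon": distance_muon,
    }


if __name__ == "__main__":
    for name, value in muon_decay_summary().items():
        print(name, value)
```

The contracted distance equals the speed times the proper lifetime, which is the statement that both observers agree the muon covers the path in one mean lifetime.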

16.2 Example: Relativistic Doppler Effect

The relativistic Doppler effect is encountered frequently in physics and astronomy. Consider monochromatic electromagnetic radiation from a source, such as a star, that is moving towards the detector at a velocity $v$. During the time $\Delta t$ in the frame of the receiver, the source emits $n$ cycles of the sinusoidal waveform. Thus the length of this waveform, as seen by the receiver, is $n\lambda$ which equals

$$n\lambda = (c - v)\Delta t$$

The frequency as measured by the receiver is

$$\nu = \frac{c}{\lambda} = \frac{cn}{(c - v)\Delta t}$$

According to the source, it emits $n$ waves of frequency $\nu_0$ during the proper time interval $\Delta t'$, that is

$$n = \nu_0 \Delta t'$$

This proper time interval $\Delta t'$, in the source frame, corresponds to a time interval $\Delta t$ in the receiver frame where

$$\Delta t = \gamma \Delta t'$$

Thus the frequency measured by the receiver is

$$\nu = \frac{1}{\left(1 - \frac{v}{c}\right)} \frac{\nu_0}{\gamma} = \frac{\sqrt{1 - \beta^2}}{1 - \beta}\, \nu_0 = \sqrt{\frac{1 + \beta}{1 - \beta}}\, \nu_0$$

where $\beta = \frac{v}{c}$. This formula for source and receiver approaching each other also gives the correct answer for source and receiver receding if the sign of $\beta$ is changed.

This  relativistic  Doppler  Effect  accounts  for  the  red  shift  observed  for  light  emitted  by  receding  stars  and 
galaxies,  as  well  as  many  examples  in  atomic  and  nuclear  physics  involving  moving  sources  of  electromagnetic 
radiation. 
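
A short numerical sketch of the Doppler relation follows. It evaluates $\nu/\nu_0 = \sqrt{(1+\beta)/(1-\beta)}$ and, as a cross-check, the intermediate form $\frac{1}{(1-\beta)\gamma}$ from which it is derived. The function names are assumptions introduced for this illustration, and positive $\beta$ is taken to mean that source and receiver approach each other.

```python
import math


def doppler_ratio(beta):
    """Ratio nu/nu0 for a source approaching (beta > 0) or receding (beta < 0)."""
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between -1 and 1")
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def doppler_ratio_from_steps(beta):
    """Same ratio written as 1 / ((1 - beta) * gamma), the step before simplification."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return 1.0 / ((1.0 - beta) * gamma)


if __name__ == "__main__":
    print(doppler_ratio(0.6))   # approaching source: frequency raised (blue shift)
    print(doppler_ratio(-0.6))  # receding source: frequency lowered (red shift)
```

The approaching and receding ratios for the same speed are reciprocals of each other, which reflects the sign change of $\beta$ described above.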


16.3  Example:  Twin  paradox 

A problem that troubled physicists for many years is called the twin paradox. Consider two twins, Jack and Jill. Assume that Jill travels in a space ship at a speed corresponding to $\gamma = 4$ for 20 years, as measured by Jack's clock, and then returns taking another 20 years, according to Jack. Thus, Jack has aged 40 years by the time his twin sister returns home. However, Jill’s clock measures $20/4 = 5$ years for each half of the trip so that she thinks she travelled for 10 years total time according to her clock. Thus she has aged only 10 years on the trip, that is, now she is 30 years younger than her twin brother. Note that, according to Jill, the distance she travelled out and back was 1/4 the distance according to Jack, so she perceives no inconsistency between her clock and the speed of the space ship. This was called a paradox because some people claimed that Jill will perceive that the earth and Jack moved away at the same relative speed in the opposite direction and thus according to Jill, Jack should be 30 years younger, not her. Moreover, some claimed that this problem is symmetric and therefore both twins must still be the same age since there is no way of telling who was moving away from whom. This argument is incorrect because Jill was able to sense that she accelerated to $\gamma = 4$ which destroys the symmetry argument. The effect is observed with accelerated beams of unstable particles such as the muon and was confirmed by the results of the experiment where cesium atomic clocks were flown around the Earth. Thus the Twin paradox is not a paradox; the fact is that Jill will be younger than her twin brother.


16.4 Relativistic kinematics

16.4.1  Velocity  transformations 

Consider the two parallel coordinate frames with the primed frame moving at a velocity $v$ along the $x_1$ axis as shown in figure 16.1. Velocities of an object measured in both frames are defined to be

$$u_i = \frac{dx_i}{dt}, \qquad u'_i = \frac{dx'_i}{dt'} \tag{16.16}$$

Using the Lorentz transformations 16.3, 16.5 between the two frames moving with relative velocity $v$ along the $x_1$ axis, gives that the velocity along the $x'_1$ axis is

$$u'_1 = \frac{dx'_1}{dt'} = \frac{dx_1 - v\,dt}{dt - \frac{v}{c^2}dx_1} = \frac{u_1 - v}{1 - \frac{u_1 v}{c^2}} \tag{16.17}$$

Similarly we get the velocities along the perpendicular $x'_2$ and $x'_3$ axes to be

$$u'_2 = \frac{dx'_2}{dt'} = \frac{u_2}{\gamma \left(1 - \frac{u_1 v}{c^2}\right)}, \qquad u'_3 = \frac{dx'_3}{dt'} = \frac{u_3}{\gamma \left(1 - \frac{u_1 v}{c^2}\right)} \tag{16.18}$$

When $\frac{v}{c} \to 0$ these velocity transformations become the usual Galilean relations for velocity addition.

Do  not  confuse  u and  u'  with  v;  that  is,  u and  u'  are  the  velocities  of  some  object  measured  in  the  unprimed 
and  primed  frames  of  reference  respectively,  whereas  v is  the  relative  velocity  of  the  origin  of  one  frame  with 
respect  to  the  origin  of  the  other  frame. 
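
The velocity transformations can be checked numerically. The sketch below is a minimal implementation for a primed frame moving with velocity $v$ along the $x_1$ axis; the function name and the tuple representation of the velocity are assumptions made for this example. It illustrates that a light signal has speed $c$ in both frames, whichever direction it travels, and that the Galilean result is recovered at low speeds.

```python
import math

C = 2.998e8  # speed of light in m/s


def transform_velocity(u, v, c=C):
    """Transform the velocity u = (u1, u2, u3) of an object from the unprimed
    frame to a primed frame moving with velocity v along the x1 axis."""
    u1, u2, u3 = u
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    denom = 1.0 - u1 * v / c**2
    return ((u1 - v) / denom, u2 / (gamma * denom), u3 / (gamma * denom))


if __name__ == "__main__":
    # Light travelling along x1 still has speed c in the primed frame.
    print(transform_velocity((C, 0.0, 0.0), 0.5 * C))
    # Light travelling along x2 changes direction but keeps speed c.
    w = transform_velocity((0.0, C, 0.0), 0.6 * C)
    print(math.hypot(w[0], w[1]))
    # Everyday speeds reproduce Galilean velocity addition, 30 - 10 = 20 m/s.
    print(transform_velocity((30.0, 0.0, 0.0), 10.0))
```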

16.4.2  Momentum 

Using the classical definition of momentum, that is $\mathbf{p} = m\mathbf{u}$, the linear momentum is not conserved using the above relativistic velocity transformations if the mass $m$ is a scalar quantity. This problem originates from the fact that both $x$ and $t$ have non-trivial transformations and thus $u = \frac{dx}{dt}$ is frame dependent.

Linear momentum conservation can be retained by redefining momentum in a form that is identical in all frames of reference, that is by referring to the proper time $\tau$ as measured in the rest frame of the moving object. Therefore we define relativistic linear momentum as

$$\mathbf{p} \equiv m \frac{d\mathbf{x}}{d\tau} = m \frac{d\mathbf{x}}{dt} \frac{dt}{d\tau} \tag{16.19}$$

But we know the time dilation relation

$$dt = \frac{d\tau}{\sqrt{1 - \left(\frac{u}{c}\right)^2}} = \gamma_u\, d\tau \tag{16.20}$$

Note that the $\gamma_u$ in this relation refers to the velocity $u$ between the moving object and the frame; this is quite different from the $\gamma = \frac{1}{\sqrt{1 - \left(\frac{v}{c}\right)^2}}$ which refers to the transformation between the two frames of reference.

Thus the new relativistic definition of momentum is

$$\mathbf{p} \equiv m \frac{d\mathbf{x}}{d\tau} = m \gamma_u \frac{d\mathbf{x}}{dt} = \gamma_u m \mathbf{u} \tag{16.21}$$


The relativistic definition of linear momentum is the same as the classical definition with the rest mass $m$ replaced by the relativistic mass $\gamma m$.¹

¹ Note that, until recently, the rest mass was denoted by $m_0$ and the relativistic mass was referred to as $m$. Modern texts denote the rest mass by $m$ and the relativistic mass by $\gamma m$. This book follows the modern nomenclature for rest mass to avoid confusion.

16.4.3 Center of momentum coordinate system

The classical relations for handling the kinematics of colliding objects carry over to special relativity when the
relativistic  definition  of  linear  momentum,  equation  16.21,  is  assumed.  That  is,  one  can  continue  to  apply 
conservation  of  linear  momentum.  However,  there  is  one  important  conceptual  difference  for  relativistic 
dynamics  in  that  the  center  of  mass  no  longer  is  a meaningful  concept  due  to  the  interrelation  of  mass 
and  energy.  However,  this  problem  is  eliminated  by  considering  the  center  of  momentum  coordinate  system 
which,  as  in  the  non-relativistic  case,  is  the  frame  where  the  total  linear  momentum  of  the  system  is  zero. 
Using  the  concept  of  center  of  momentum  allows  use  of  the  formalism  of  classical  non-relativistic  kinematics. 


16.4.4  Force 

Newton’s second law $\mathbf{F} = \frac{d\mathbf{p}}{dt}$ is covariant under a Galilean transformation. In special relativity this definition also applies using the relativistic definition of momentum $\mathbf{p}$. The fact that the relativistic momentum $\mathbf{p}$ is conserved in the force-free situation leads naturally to defining the force as

$$\mathbf{F} \equiv \frac{d\mathbf{p}}{dt} \tag{16.22}$$

Then the relativistic momentum is conserved if $\mathbf{F} = 0$.


16.4.5  Energy 

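The classical definition of work done is

$$W_{12} \equiv \int_1^2 \mathbf{F} \cdot d\mathbf{r} = T_2 - T_1 \tag{16.23}$$

Assume $T_1 = 0$, let $d\mathbf{r} = \mathbf{u}\,dt$ and insert the relativistic force relation in equation 16.23, which gives

$$W = T = \int_0^t \frac{d}{dt}\left(\gamma_u m \mathbf{u}\right) \cdot \mathbf{u}\,dt = m \int_0^u u\, d\left(\gamma_u u\right) \tag{16.24}$$

Integration by parts, followed by algebraic manipulation, gives

$$T = \gamma_u m u^2 - m \int_0^u \frac{u\,du}{\sqrt{1 - \frac{u^2}{c^2}}} = \gamma_u m u^2 + mc^2 \sqrt{1 - \frac{u^2}{c^2}} - mc^2 = mc^2 \left(\gamma_u - 1\right) \tag{16.25}$$

Define the rest energy $E_0$

$$E_0 \equiv mc^2 \tag{16.26}$$

and total relativistic energy $E$

$$E \equiv \gamma_u mc^2 \tag{16.27}$$

then equation 16.25 can be written as

$$E = T + E_0 = \gamma_u mc^2 \tag{16.28}$$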

This  is  the  famous  Einstein  relativistic  energy  that  relates  the  equivalence  of  mass  and  energy.  The  total 
relativistic  energy  E is  a conserved  quantity  in  nature.  It  is  an  extension  of  the  conservation  of  energy  and 
manifestations  of  the  equivalence  of  energy  and  mass  occur  extensively  in  the  real  world. 

In nuclear physics we often convert mass to energy and back again to mass. For example, gamma rays with energies greater than 1.022 MeV, which are pure electromagnetic energy, can be converted to an electron plus positron, both of which have rest mass. The positron can then annihilate a different electron in another atom resulting in emission of two 511 keV gamma rays in back to back directions to conserve linear momentum. A dramatic example of Einstein’s equation is a nuclear reactor. One gram of material, the mass of a paper clip, is equivalent to $E = 9 \times 10^{13}$ joules. This is roughly the daily output of a 1 GWatt nuclear power station or the explosive energy of the Nagasaki or Hiroshima bombs.
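
These figures follow directly from $E_0 = mc^2$, as the sketch below illustrates. The electron mass and the joule to MeV conversion factor are standard values supplied here as assumptions, rounded to four significant figures, and the function names are hypothetical.

```python
C = 2.998e8                 # speed of light in m/s
ELECTRON_MASS = 9.109e-31   # kg, assumed standard value
JOULES_PER_MEV = 1.602e-13  # assumed conversion factor


def rest_energy(mass_kg, c=C):
    """Rest energy in joules of a mass given in kilograms."""
    return mass_kg * c**2


def pair_production_threshold_mev():
    """Minimum gamma-ray energy, in MeV, needed to create an electron plus positron."""
    return 2.0 * rest_energy(ELECTRON_MASS) / JOULES_PER_MEV


if __name__ == "__main__":
    one_gram = rest_energy(1.0e-3)
    one_day_at_1_gw = 1.0e9 * 24 * 3600  # joules delivered by 1 GW in one day
    print(one_gram, one_day_at_1_gw, one_gram / one_day_at_1_gw)
    print(pair_production_threshold_mev())
```

The ratio printed on the first line is close to one, which is the sense in which one gram of mass corresponds to about a day of output from a 1 GW power station.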

As the velocity of a particle $v$ approaches $c$ then $\gamma$ and the relativistic mass $\gamma m$ both approach infinity. This means that the force needed to accelerate the mass also approaches infinity, and thus no particle can exceed the velocity of light. The energy continues to increase not by increasing the velocity but by increase of the relativistic mass. Although the relativistic relation for kinetic energy is quite different from the Newtonian relation, the Newtonian form is obtained for the case of $u \ll c$ in that

$$T = mc^2 \left(1 - \frac{u^2}{c^2}\right)^{-\frac{1}{2}} - mc^2 = mc^2 \left(1 + \frac{1}{2}\frac{u^2}{c^2} + \cdots\right) - mc^2 \approx \frac{1}{2} m u^2 \tag{16.29}$$

An especially useful relativistic relation that can be derived from the above is

$$E^2 = p^2 c^2 + E_0^2 \tag{16.30}$$

This  is  useful  because  it  provides  a simple  relation  between  total  energy  of  a particle  and  its  relativistic 
linear  momentum  plus  rest  energy. 
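
The relation between energy and momentum follows in one line from the definitions $E = \gamma_u mc^2$ and $p = \gamma_u m u$, since $E^2 - p^2c^2 = \gamma_u^2 m^2 c^4 \left(1 - \frac{u^2}{c^2}\right) = m^2c^4 = E_0^2$. The sketch below checks this numerically and also confirms that the relativistic kinetic energy approaches $\frac{1}{2}mu^2$ at low speed. The function names and the test values are assumptions chosen for illustration.

```python
import math

C = 2.998e8  # speed of light in m/s


def gamma_u(u, c=C):
    """Lorentz factor for an object moving with speed u."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def momentum(m, u, c=C):
    """Relativistic linear momentum, gamma_u * m * u."""
    return gamma_u(u, c) * m * u


def total_energy(m, u, c=C):
    """Total relativistic energy, gamma_u * m * c^2."""
    return gamma_u(u, c) * m * c**2


def kinetic_energy(m, u, c=C):
    """Relativistic kinetic energy, m * c^2 * (gamma_u - 1)."""
    return (gamma_u(u, c) - 1.0) * m * c**2


if __name__ == "__main__":
    m, u = 1.0, 0.8 * C
    lhs = total_energy(m, u) ** 2
    rhs = (momentum(m, u) * C) ** 2 + (m * C**2) ** 2
    print(lhs, rhs)
    slow = 1.0e-3 * C
    print(kinetic_energy(m, slow), 0.5 * m * slow**2)
```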


16.4  Example:  Rocket  propulsion 

Consider a rocket, having initial mass $M$, that is accelerated in a straight line in free space by exhausting propellant at a constant speed $v_p$ relative to the rocket. Let $u$ be the speed of the rocket relative to its initial rest frame $S$, when its rest mass has decreased to $m$. At this instant the rocket is at rest in the inertial frame $S'$. At a proper time $\tau + d\tau$ the rest mass is $m - dm$ and it has acquired a velocity increment $du'$ relative to $S'$, and propellant of rest mass $dm_p$ has been expelled with velocity $v_p$ relative to $S'$. At proper time $\tau$ in $S'$ the rest energy is $mc^2$. At the time $\tau + d\tau$, energy conservation requires that

$$\gamma_{du'} (m - dm) c^2 + \gamma_{v_p}\, dm_p\, c^2 = mc^2$$

At the same instant, conservation of linear momentum requires

$$\gamma_{du'} (m - dm)\, du' - \gamma_{v_p}\, dm_p\, v_p = 0$$
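To first order these two equations simplify to

$$dm_p = \sqrt{1 - \left(\frac{v_p}{c}\right)^2}\; dm = \frac{dm}{\gamma_{v_p}}$$

$$m\, du' = \gamma_{v_p} v_p\, dm_p$$

Therefore

$$m\, du' = v_p\, dm \tag{a}$$

The velocity increment $du'$ in frame $S'$ can be transformed back to frame $S$ using the velocity addition relation that follows from equation 16.5, that is

$$u + du = \frac{u + du'}{1 + \frac{u\, du'}{c^2}} \approx u + \left[1 - \left(\frac{u}{c}\right)^2\right] du' \tag{b}$$

In equation a the quantity $dm$ denotes the decrease in rest mass, so the rest mass $m$ changes by $-dm$ while the speed increases by $du$. Writing $dm$ for the differential of $m$ from here on, equations a and b yield a differential equation for $u(m)$ of

$$\frac{du}{1 - \left(\frac{u}{c}\right)^2} = -v_p \frac{dm}{m}$$

Integrating the left-hand side between 0 and $u$ and the right-hand side between $M$ and $m$ gives

$$\frac{1}{2}\, c \ln \left(\frac{1 + \frac{u}{c}}{1 - \frac{u}{c}}\right) = -v_p \ln \left(\frac{m}{M}\right)$$

This reduces to

$$\frac{u}{c} = \frac{1 - \left(\frac{m}{M}\right)^{2v_p/c}}{1 + \left(\frac{m}{M}\right)^{2v_p/c}}$$

When $\frac{v_p}{c} \to 0$ this equation reduces to the non-relativistic answer given in equation 2.123.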
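
The limiting behaviour of the rocket result can be examined numerically. The sketch below evaluates $\frac{u}{c} = \frac{1 - (m/M)^{2v_p/c}}{1 + (m/M)^{2v_p/c}}$ and compares it with the classical rocket relation $u = v_p \ln\left(\frac{M}{m}\right)$, which is assumed here to be the non-relativistic answer referred to above. The function names and the sample mass ratios and exhaust speeds are illustrative assumptions.

```python
import math

C = 2.998e8  # speed of light in m/s


def rocket_speed(mass_ratio, v_p, c=C):
    """Relativistic rocket speed when the rest mass has fallen to
    mass_ratio = m/M, for exhaust speed v_p relative to the rocket."""
    x = mass_ratio ** (2.0 * v_p / c)
    return c * (1.0 - x) / (1.0 + x)


def rocket_speed_classical(mass_ratio, v_p):
    """Classical rocket equation u = v_p * ln(M/m), assumed non-relativistic limit."""
    return -v_p * math.log(mass_ratio)


if __name__ == "__main__":
    # Chemical-rocket exhaust speed: the two results are almost identical.
    print(rocket_speed(0.1, 3.0e3), rocket_speed_classical(0.1, 3.0e3))
    # Exhaust at half the speed of light: the classical result exceeds c,
    # whereas the relativistic result stays below c.
    print(rocket_speed(0.01, 0.5 * C) / C, rocket_speed_classical(0.01, 0.5 * C) / C)
```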


16.5 Geometry of space-time

16.5.1  Four-dimensional  space-time 

In 1906 Poincaré showed that the Lorentz transformation can be regarded as a rotation in a 4-dimensional Euclidean space-time produced by adding an imaginary fourth space-time coordinate $ict$ to the three real spatial coordinates. In 1908 Minkowski reformulated Einstein’s Special Theory of Relativity in this 4-dimensional Euclidean space-time vector space and concluded that the spatial variables $q_i$, where $i = 1, 2, 3$, plus the time $q_0 = ict$ are equivalent variables and should be treated equally using a covariant representation of both space and time. The idea of using an imaginary time axis $ict$ to make space-time Euclidean was elegant, but it obscured the non-Euclidean nature of space-time as well as causing difficulties when generalized to non-inertial accelerating frames in the General Theory of Relativity. As a consequence, the use of the imaginary $ict$ has been abandoned in modern work. Minkowski developed an alternative non-Euclidean metric that treats the four coordinates $(ct, x, y, z)$ as a four-dimensional Minkowski space-time with all coordinates being real, and introduces the required minus sign explicitly.

Analogous to the usual 3-dimensional cartesian coordinates, the displacement four vector $d\mathbf{s}$ is defined using the four components along the four unit vectors in either the unprimed or primed coordinate frames.

$$d\mathbf{s} = dx^0 \mathbf{e}_0 + dx^1 \mathbf{e}_1 + dx^2 \mathbf{e}_2 + dx^3 \mathbf{e}_3 = dx'^0 \mathbf{e}'_0 + dx'^1 \mathbf{e}'_1 + dx'^2 \mathbf{e}'_2 + dx'^3 \mathbf{e}'_3 \tag{16.31}$$

The convention used is that Greek subscripts (covariant) or superscripts (contravariant) designate a four vector with $0 \leq \mu \leq 3$. The covariant unit vectors $\mathbf{e}_\mu$ are written with the subscript $\mu$ which has 4 values $0 \leq \mu \leq 3$. As described in appendix E 3, using the Einstein convention the components are written with the contravariant superscript $dx^\mu$ where the time axis $x^0 = ct$, while the spatial coordinates, expressed in cartesian coordinates, are $x^1 = x$, $x^2 = y$, and $x^3 = z$. With respect to a different (primed) unit vector basis $\mathbf{e}'$ the displacement must be unchanged as given by equation 16.31. In addition, equation 16.43 shows that the magnitude $|d\mathbf{s}|^2$ of the displacement four vector is invariant to a Lorentz transformation.

The most general Lorentz transformation between inertial coordinate systems $S$ and $S'$, in relative motion with velocity $\mathbf{v}$, assuming that the two sets of axes are aligned, and that their origins overlap when $t = t' = 0$, is given by the symmetric matrix $\Lambda$ where

$x'^\mu = \sum_\nu \Lambda^\mu{}_\nu x^\nu$  (16.32)

This Lorentz transformation of the four-vector $\mathbf{X}$ components can be written in matrix form as

$\mathbf{X}' = \Lambda\mathbf{X}$  (16.33)


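Assuming that the two sets of axes are aligned, then the elements of the Lorentz transformation $\Lambda^\mu{}_\nu$ are given by

$\mathbf{X}' = \begin{pmatrix} ct' \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta_1 & -\gamma\beta_2 & -\gamma\beta_3 \\ -\gamma\beta_1 & 1 + (\gamma-1)\frac{\beta_1^2}{\beta^2} & (\gamma-1)\frac{\beta_1\beta_2}{\beta^2} & (\gamma-1)\frac{\beta_1\beta_3}{\beta^2} \\ -\gamma\beta_2 & (\gamma-1)\frac{\beta_1\beta_2}{\beta^2} & 1 + (\gamma-1)\frac{\beta_2^2}{\beta^2} & (\gamma-1)\frac{\beta_2\beta_3}{\beta^2} \\ -\gamma\beta_3 & (\gamma-1)\frac{\beta_1\beta_3}{\beta^2} & (\gamma-1)\frac{\beta_2\beta_3}{\beta^2} & 1 + (\gamma-1)\frac{\beta_3^2}{\beta^2} \end{pmatrix} \begin{pmatrix} ct \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}$  (16.34)

where $\beta = \frac{v}{c}$, with components $\beta_i = \frac{v_i}{c}$, and $\gamma = \frac{1}{\sqrt{1-\beta^2}}$,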


and  assuming  that  the  origin  of  S transforms  to  the  origin  of  S'  at  (0,0,  0,  0). 


For the case illustrated in figure 16.1, where the corresponding axes of the two frames are parallel and in relative motion with velocity $v$ in the $x^1$ direction, the Lorentz transformation matrix 16.34 reduces to

$\begin{pmatrix} ct' \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}$  (16.35)


This Lorentz transformation matrix is called a standard boost since it only boosts from one frame to another parallel frame. In general a rotation matrix also is incorporated into the transformation matrix $\Lambda$ for the spatial variables.
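
The boost matrices 16.34 and 16.35 can be checked numerically. The following is a minimal sketch, assuming units in which the velocity is supplied as $\beta_i = v_i/c$; the function names `boost_matrix`, `matmul`, and `transpose` are illustrative helpers and are not taken from the text. It builds the general boost, confirms that it preserves the Minkowski metric ($\Lambda^T g \Lambda = g$), and confirms that it reduces to the standard boost when the motion is along $x^1$.

```python
import math

def boost_matrix(beta):
    """4x4 Lorentz boost (equation 16.34) for beta = (v1/c, v2/c, v3/c), axes aligned."""
    b2 = sum(b * b for b in beta)
    if b2 == 0.0:
        # No relative motion: the transformation is the identity.
        return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    gamma = 1.0 / math.sqrt(1.0 - b2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma
    for i in range(3):
        lam[0][i + 1] = lam[i + 1][0] = -gamma * beta[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            lam[i + 1][j + 1] = delta + (gamma - 1.0) * beta[i] * beta[j] / b2
    return lam

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

# Minkowski metric with the (+, -, -, -) sign convention used in this chapter.
METRIC = [[1.0, 0.0, 0.0, 0.0],
          [0.0, -1.0, 0.0, 0.0],
          [0.0, 0.0, -1.0, 0.0],
          [0.0, 0.0, 0.0, -1.0]]

lam = boost_matrix([0.3, -0.4, 0.5])                 # arbitrary direction, |beta|^2 = 0.5
check = matmul(transpose(lam), matmul(METRIC, lam))  # should reproduce METRIC
max_err = max(abs(check[i][j] - METRIC[i][j]) for i in range(4) for j in range(4))

std = boost_matrix([0.6, 0.0, 0.0])                  # standard boost along x^1, gamma = 1.25
```

Preservation of the metric is the matrix statement that four-vector scalar products are unchanged by the transformation, which is why the same check works for any boost direction.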

16.5.2 Four-vector scalar products

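Scalar products of vectors and tensors usually are invariant to rotations in three-dimensional space, which provides an easy way to solve problems. The scalar, or inner, product of two four vectors is defined by

$\mathbf{X}\cdot\mathbf{Y} = g_{\mu\nu}X^\mu Y^\nu = \begin{pmatrix} X^0 & X^1 & X^2 & X^3 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} Y^0 \\ Y^1 \\ Y^2 \\ Y^3 \end{pmatrix} = X^0Y^0 - X^1Y^1 - X^2Y^2 - X^3Y^3$  (16.36)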


The correct sign of the inner product is obtained by inclusion of the Minkowski metric $g$ defined by

$g_{\mu\nu} \equiv \hat{\mathbf{e}}_\mu \cdot \hat{\mathbf{e}}_\nu$  (16.37)

that is, it can be represented by the matrix

$g \equiv \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$  (16.38)

The sign convention used in the Minkowski metric, equation 16.38, has been chosen with the time coordinate $(ct)^2$ positive, which makes $(ds)^2 > 0$ for objects moving at less than the speed of light and corresponds to $ds$ being real.2

The presence of the Minkowski metric matrix in the inner product of four vectors complicates General Relativity, and thus the Einstein convention has been adopted where the components of the contravariant four-vector $\mathbf{X}$ are written with superscripts $X^\mu$. See also appendix E. The corresponding covariant four-vector components are written with the subscript $X_\mu$ which is related to the contravariant four-vector components $X^\nu$ using the $g_{\mu\nu}$ component of the covariant Minkowski metric matrix $g$. That is

$X_\mu = \sum_{\nu=0}^{3} g_{\mu\nu}X^\nu$  (16.39)

The contravariant metric component $g^{\mu\nu}$ is defined as the $\mu\nu$ component of the inverse metric matrix $g^{-1}$ where

$g g^{-1} = I = g^{-1} g$  (16.40)

where $I$ is the four-vector identity matrix. The contravariant components of the four vector can be expressed in terms of the covariant components as

$X^\mu = \sum_{\nu=0}^{3} g^{\mu\nu}X_\nu$  (16.41)

Thus equations 16.39 and 16.41 can be used to transform between covariant and contravariant four vectors, that is, to raise or lower the index $\mu$.

The scalar inner product of two four vectors can be written compactly as the scalar product of a covariant four vector and a contravariant four vector. The Minkowski metric matrix can be absorbed into either $\mathbf{X}$ or $\mathbf{Y}$ thus

$\mathbf{X}\cdot\mathbf{Y} = \sum_{\mu=0}^{3}\sum_{\nu=0}^{3} g_{\mu\nu}X^\mu Y^\nu = \sum_{\nu=0}^{3} X_\nu Y^\nu = \sum_{\mu=0}^{3} X^\mu Y_\mu$  (16.42)

If this covariant expression is Lorentz invariant in one coordinate system, then it is Lorentz invariant in all coordinate systems obtained by proper Lorentz transformations.

2 Older textbooks, such as all editions of Marion, and the first two editions of Goldstein, use the Euclidean Poincaré 4-dimensional space-time with the imaginary time axis $ict$. About half the scientific community, and modern physics textbooks including this textbook and the 3rd edition of Goldstein, use the Bjorken-Drell $+, -, -, -$ sign convention given in equation 16.38 where $x_0 = ct$, and $x_1, x_2, x_3$ are the spatial coordinates. The other half of the community, including mathematicians and gravitation physicists, use the opposite $-, +, +, +$ sign convention. Further confusion is caused by a few books that assign the time axis $ct$ to be $x_4$ rather than $x_0$.

The scalar inner product of the invariant space-time interval is an especially important example.

$(ds)^2 = \mathbf{X}\cdot\mathbf{X} = c^2(dt)^2 - (d\mathbf{r})^2 = (c\,dt)^2 - \sum_{i=1}^{3} dx_i^2 = (c\,d\tau)^2$  (16.43)

This is invariant to a Lorentz transformation as can be shown by applying the Lorentz standard boost transformation given above. In particular, if $S'$ is the rest frame of the clock, then the invariant space-time interval $ds$ is simply given by the proper time interval $d\tau$, that is, $ds = c\,d\tau$.
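
The index lowering of equation 16.39 and the invariance of the interval 16.43 can be illustrated with a short numerical sketch. The displacement components and the boost speed below are arbitrary illustrative values, and the helper names `lower`, `dot`, and `standard_boost` are assumptions introduced here, not names from the text.

```python
import math

G = (1.0, -1.0, -1.0, -1.0)  # diagonal of the Minkowski metric, equation 16.38

def lower(x):
    """Lower the index: X_mu = g_{mu nu} X^nu (equation 16.39)."""
    return [g * xi for g, xi in zip(G, x)]

def dot(x, y):
    """Four-vector scalar product X.Y = X_mu Y^mu (equation 16.42)."""
    return sum(xl * yi for xl, yi in zip(lower(x), y))

def standard_boost(x, beta):
    """Apply the standard boost of equation 16.35 along x^1 with speed beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    ct, x1, x2, x3 = x
    return [gamma * (ct - beta * x1), gamma * (x1 - beta * ct), x2, x3]

# Displacement (c dt, dx, dy, dz) of a clock moving at 0.6c along x^1 in frame S.
dX = [5.0, 3.0, 0.0, 0.0]
dXp = standard_boost(dX, 0.6)   # the same displacement seen in the clock rest frame S'

interval = dot(dX, dX)          # (c dt)^2 - dx^2 evaluated in S
interval_p = dot(dXp, dXp)      # the same quantity evaluated in S'
```

In the rest frame the spatial part of the displacement vanishes, so the interval reduces to $(c\,d\tau)^2$; both frames give the same value, as equation 16.43 requires.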

16.5.3  Minkowski  space-time 

Figure 16.6 illustrates a three-dimensional $(ct, x^1, x^2)$ representation of the 4-dimensional space-time diagram where it is assumed that $x^3 = 0$. The fact that light has a fixed velocity leads to the concept of the light cone defined by the locus of $|\mathbf{x}| = ct$.

Inside  the  light  cone 

The vertex of the cones represents the present. Locations inside the upper cone represent the future while the past is represented by locations inside the lower cone. Note that $(ds)^2 = c^2(dt)^2 - (d\mathbf{r})^2 > 0$ inside both the future and past light cones. Thus the space-time interval $c\Delta t$ is real and positive for the future, whereas it is real and negative for the past relative to the vertex of the light cone. A world line is the trajectory a particle follows as a function of time in Minkowski space. In the interior of the future light cone $\Delta t > 0$ and, since it is real, it can be asserted unambiguously that any point inside this forward cone must occur later than at the vertex of the cone, that is, it is the absolute future.

A Lorentz transformation can rotate Minkowski space such that the axis $x_0$ goes through any point within this light cone and then the "world line" is pure time-like. Similarly, any point inside the backward light cone unambiguously occurred before the vertex, that is, it is the absolute past.

Outside  the  light  cone 

Outside of the light cone, $(ds)^2 = c^2(dt)^2 - (d\mathbf{r})^2 < 0$ and thus $\Delta s$ is imaginary and is called space-like. A space-like plane hypersurface in spatial coordinates is shown for the present time in the unprimed frame. A rotation in Minkowski space can be made to $S'$ such that the space-like hypersurface now is tilted relative to the hypersurface shown and thus any point $P$ outside the light cone can be made to occur later, simultaneous, or earlier than at the vertex depending on the orientation of the space-like hypersurface. This startling situation implies that the time ordering of two points, each outside the other's light cone, can be reversed, which has profound implications related to the concept of simultaneity and the notion of causality.

For the special case of two events lying on the light cone, $\mathbf{X}\cdot\mathbf{X} = c^2t^2 - (x_1^2 + x_2^2 + x_3^2) = 0$ and thus these events are separated by a light ray travelling at velocity $c$. Only events separated by time-like intervals can be connected causally. The world line of a particle must lie within its light cone. The division of intervals into space-like and time-like, because of their invariance, is an absolute concept. That is, it is independent of the frame of reference.

Figure 16.6: The light cone in the $ct, x_1, x_2$ space is defined by the condition $\mathbf{X}\cdot\mathbf{X} = c^2t^2 - r^2 = 0$ and divides space-time into the forward and backward light cones, with $t > 0$ and $t < 0$ respectively; the interiors of the forward and backward light cones are called absolute future and absolute past.

The concept of proper time can be expanded by considering a clock at rest in frame $S'$ which is moving with uniform velocity $v$ with respect to a rest frame $S$. The clock at rest in the $S'$ frame measures the proper time $\tau$, and the time observed in the fixed frame can be obtained by looking at the interval $ds$. Because of the invariance of the interval $ds^2$,

$ds^2 = c^2d\tau^2 = c^2dt^2 - \left[dx_1^2 + dx_2^2 + dx_3^2\right]$  (16.44)


That is,

$d\tau = dt\sqrt{1 - \frac{dx_1^2 + dx_2^2 + dx_3^2}{c^2dt^2}} = \frac{dt}{\gamma}$  (16.45)

that is, $dt = \gamma\,d\tau$, which satisfies the normal expression for time dilation, 16.8.
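
Equation 16.45 can be accumulated along a world line to give the elapsed proper time. The sketch below is illustrative only: it assumes the speed is supplied as a function returning $v/c$ at coordinate time $t$, uses a simple midpoint rule, and the name `proper_time` and the numerical values are hypothetical.

```python
import math

def proper_time(speed_over_c, t_start, t_end, steps=1000):
    """Integrate d(tau) = dt * sqrt(1 - (v/c)^2) (equation 16.45) with the midpoint rule."""
    dt = (t_end - t_start) / steps
    tau = 0.0
    for k in range(steps):
        t_mid = t_start + (k + 0.5) * dt
        beta = speed_over_c(t_mid)          # v/c at the middle of this time step
        tau += dt * math.sqrt(1.0 - beta * beta)
    return tau

# A clock moving at a constant 0.8c for 10 units of coordinate time in frame S.
tau_const = proper_time(lambda t: 0.8, 0.0, 10.0)
gamma = 1.0 / math.sqrt(1.0 - 0.8 ** 2)     # gamma = 5/3 for this speed
```

For constant speed the integral reduces to $\tau = t/\gamma$, so `tau_const * gamma` recovers the coordinate-time interval; a time-dependent speed function can be passed in the same way.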


16.5.4  Momentum-energy  four  vector 

The previous four-vector discussion can be elegantly exploited using the covariant Minkowski space-time representation. Separating the spatial and time components of the differential four vector gives

$d\mathbf{X} = (c\,dt, d\mathbf{x})$  (16.46)

Remember that the square of the four-dimensional space-time element of length $(ds)^2$ is invariant (16.43), and is simply related to the proper time element $d\tau$. Thus the scalar product

$d\mathbf{X}\cdot d\mathbf{X} = ds^2 = c^2d\tau^2 = c^2dt^2 - \left[dx_1^2 + dx_2^2 + dx_3^2\right]$  (16.47)

Thus the proper time is an invariant.

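The ratio of the four-vector element $d\mathbf{X}$ and the invariant proper time interval $d\tau$ is a four-vector called the four-vector velocity $\mathbf{U}$ where

$\mathbf{U} = \frac{d\mathbf{X}}{d\tau} = \left(c\frac{dt}{d\tau}, \frac{d\mathbf{x}}{d\tau}\right) = \gamma_u\,(c, \mathbf{u})$  (16.48)

where $\mathbf{u}$ is the particle velocity, and $\gamma_u = \frac{1}{\sqrt{1 - u^2/c^2}}$.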

The four-vector momentum $\mathbf{P}$ can be obtained from the four-vector velocity by multiplying it by the scalar rest mass $m$

$\mathbf{P} = m\mathbf{U} = (\gamma_u mc, \gamma_u m\mathbf{u})$  (16.49)

However,

$\gamma_u mc = \frac{E}{c}$  (16.50)

thus the momentum four vector can be written as

$\mathbf{P} = \left(\frac{E}{c}, \mathbf{p}\right)$  (16.51)

where the vector $\mathbf{p}$ represents the three spatial components of the relativistic momentum. It is interesting to realize that the Theory of Relativity couples not only the spatial and time coordinates, but also their conjugate variables, linear momentum $\mathbf{p}$ and total energy $E/c$.

An additional feature of this momentum-energy four vector $\mathbf{P}$ is that the scalar inner product $\mathbf{P}\cdot\mathbf{P}$ is invariant to Lorentz transformations and equals $(mc)^2$ in the rest frame

$\mathbf{P}\cdot\mathbf{P} = \sum_{\mu=0}^{3}\sum_{\nu=0}^{3} g_{\mu\nu}P^\mu P^\nu = \sum_{\mu=0}^{3} P^\mu P_\mu = \left(\frac{E}{c}\right)^2 - |\mathbf{p}|^2 = m^2c^2$  (16.52)

which leads to the well-known equation

$E^2 = p^2c^2 + E_0^2$  (16.53)

where $E_0 = mc^2$ is the rest energy. The Lorentz transformation matrix $\Lambda$ can be applied to $\mathbf{P}$

$\mathbf{P}' = \Lambda\mathbf{P}$  (16.54)

The Lorentz invariant four-vector representation is illustrated by applying the Lorentz transformation shown in figure 16.1, which gives $p'_1 = \gamma\left(p_1 - \frac{v}{c}\frac{E}{c}\right)$, $p'_2 = p_2$, $p'_3 = p_3$, and $E' = \gamma(E - vp_1)$.
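
The invariance of $\mathbf{P}\cdot\mathbf{P}$ under the standard boost can be verified numerically. This is a minimal sketch in units with $c = 1$ (an assumption made for brevity); the mass, the velocities, and the helper names `four_momentum`, `invariant`, and `boost_x` are illustrative choices, not values or names from the text.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption of this sketch

def four_momentum(m, u):
    """P = (E/c, p) = gamma_u * m * (c, u) for a particle of 3-velocity u (equation 16.49)."""
    gamma_u = 1.0 / math.sqrt(1.0 - sum(ui * ui for ui in u) / C**2)
    return [gamma_u * m * C] + [gamma_u * m * ui for ui in u]

def invariant(p):
    """P.P = (E/c)^2 - |p|^2 (equation 16.52)."""
    return p[0] ** 2 - sum(pi * pi for pi in p[1:])

def boost_x(p, v):
    """Standard boost along x^1 with speed v, i.e. equation 16.35 applied to P."""
    beta = v / C
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [gamma * (p[0] - beta * p[1]), gamma * (p[1] - beta * p[0]), p[2], p[3]]

m = 2.0
P = four_momentum(m, [0.5, 0.2, -0.1])
P_prime = boost_x(P, 0.6)

# Boosting with the particle's own velocity lands in its rest frame, where P = (mc, 0, 0, 0).
P_rest = boost_x(four_momentum(m, [0.6, 0.0, 0.0]), 0.6)
```

Both `invariant(P)` and `invariant(P_prime)` equal $m^2c^2$ to rounding error, while the individual components of energy and momentum differ between the frames.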

16.6 Lorentz-invariant formulation of Lagrangian mechanics

16.6.1  Parametric  formulation 

The Lagrangian and Hamiltonian formalisms in classical mechanics are based on the Newtonian concept of absolute time $t$ which serves as the system evolution parameter in Hamilton's Principle. This approach violates the Special Theory of Relativity. The extended Lagrangian and Hamiltonian formalism is a parametric approach, pioneered by Lanczos[La49], that introduces a system evolution parameter $s$ that serves as the independent variable in the action integral, and all the space-time variables $q_i(s), t(s)$ are dependent on the evolution parameter $s$. This extension renders the Lagrangian and Hamiltonian formalism in a form that is compatible with the Special Theory of Relativity. The importance of the Lorentz-invariant extended formulation of Lagrangian and Hamiltonian mechanics has been recognized for decades.[La49, Go50, Sy60] Recently there has been a resurgence of interest in the extended Lagrangian and Hamiltonian formalism stimulated by the papers of Struckmeier[Str05, Str08] and this formalism has featured prominently in recent textbooks by Johns[Jo05] and Greiner[Gr10]. This parametric approach develops manifestly-covariant Lagrangian and Hamiltonian formalisms that treat equally all $2n + 1$ space-time canonical variables. It provides a plausible manifestly-covariant Lagrangian for the one-body system, but serious problems exist extending this to the $N$-body system when $N > 1$. Generalizing the Lagrangian and Hamiltonian formalisms into the domain of the Special Theory of Relativity is of fundamental importance to physics, while the parametric approach gives insight into the philosophy underlying use of variational methods in classical mechanics.3

In conventional Lagrangian mechanics, the equations of motion for the $n$ generalized coordinates are derived by requiring that the action integral be stationary, that is, Hamilton's Principle.

$\delta S(\mathbf{q}, \dot{\mathbf{q}}, t) = \delta\int_{t_a}^{t_b} L(\mathbf{q}(t), \dot{\mathbf{q}}(t), t)\,dt = 0$  (16.55)

where $L(\mathbf{q}(t), \dot{\mathbf{q}}(t), t)$ denotes the conventional Lagrangian. This approach implicitly assumes the Newtonian concept of absolute time $t$ which is chosen to be the independent variable that characterizes the evolution parameter of the system. The actual path $[\mathbf{q}(t), \dot{\mathbf{q}}(t)]$ the system follows is defined by the extremum of the action integral $S(\mathbf{q}, \dot{\mathbf{q}}, t)$ which leads to the corresponding Euler-Lagrange equations. This assumption is contrary to the Theory of Relativity which requires that the space and time variables be treated equally, that is, the Lagrangian formalism must be covariant.

16.6.2  Extended  Lagrangian 

Lanczos[La49] proposed making the Lagrangian covariant by introducing a general evolution parameter $s$, and treating the time as a dependent variable $t(s)$ on an equal footing with the configuration space variables $q_i(s)$. That is, the time becomes a dependent variable $q_0(s) = ct(s)$ similar to the spatial variables $q_\mu(s)$ where $1 \le \mu \le n$. The dynamical system then is described as motion confined to a hypersurface within an extended space where the value of the extended Hamiltonian and the evolution parameter $s$ constitute an additional pair of canonically conjugate variables in the extended space. That is, the canonical momentum $p_0$, corresponding to $q_0 = ct$, is $p_0 = \frac{E}{c}$ similar to the momentum-energy four vector, equation 16.51.

An extended Lagrangian $\mathbb{L}(\mathbf{q}(s), \frac{d\mathbf{q}(s)}{ds}, t(s), \frac{dt(s)}{ds})$ can be defined which can be written compactly as $\mathbb{L}(q^\mu(s), \frac{dq^\mu(s)}{ds})$ where the index $0 \le \mu \le n$ denotes the entire range of space-time variables.

This extended Lagrangian can be used in an extended action functional $\mathbb{S}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ to give an extended version of Hamilton's Principle4

$\delta\mathbb{S}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}) = \delta\int_{s_a}^{s_b} \mathbb{L}(q^\mu(s), \frac{dq^\mu(s)}{ds})\,ds = 0$  (16.56)


3 Chapters 16.6 and 16.7 reproduce the Struckmeier presentation.[Str08]

4 These formulae involve total and partial derivatives with respect to both time $t$ and parameter $s$. For clarity, the derivatives are written out in full because Lanczos[La49] and Johns[Jo05] use the opposite convention for the dot and prime superscripts as abbreviations for the differentials with respect to $t$ and $s$. The blackboard bold format is used to designate the extended versions of the action $\mathbb{S}$, Lagrangian $\mathbb{L}$ and Hamiltonian $\mathbb{H}$.

The conventional action $S$, and extended action $\mathbb{S}$, address alternate characterizations of the same underlying physical system, and thus the action principle implies that $\delta S = \delta\mathbb{S} = 0$ must hold simultaneously. That is,

$\delta\int_{s_a}^{s_b} \mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})\,ds = \delta\int_{s_a}^{s_b} L(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t)\frac{dt}{ds}\,ds$  (16.57)

As discussed in chapter 13.3, there is a continuous spectrum of equivalent gauge-invariant Lagrangians for which the Euler-Lagrange equations lead to identical equations of motion. Equation 16.57 is satisfied if the conventional and extended Lagrangians are related by

$\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}) = L(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t)\frac{dt}{ds} + \frac{d\Lambda(\mathbf{q},t)}{ds}$  (16.58)

where $\Lambda(\mathbf{q},t)$ is a continuous function of $\mathbf{q}$ and $t$ that has continuous second derivatives. It is acceptable to assume that $\frac{d\Lambda(\mathbf{q},t)}{ds} = 0$; then the extended and conventional Lagrangians have a unique relation requiring no simultaneous transformation of the dynamical variables. That is, assume

$\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}) = L(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t)\frac{dt}{ds}$  (16.59)

Note that the time derivative of $\mathbf{q}$ can be expressed in terms of the $s$ derivatives by

$\frac{d\mathbf{q}}{dt} = \frac{d\mathbf{q}/ds}{dt/ds}$  (16.60)


Thus,  for  a conventional  Lagrangian  with  n variables,  the  corresponding  extended  Lagrangian  is  a function 
of  n + 1 variables  while  the  conventional  and  extended  Lagrangians  are  related  using  equations  16.59,  and 
16.60. 

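The derivatives of the relation between the extended and conventional Lagrangians lead to

$\frac{\partial\mathbb{L}}{\partial q^\mu} = \frac{\partial L}{\partial q^\mu}\frac{dt}{ds}$  (16.61)

$\frac{\partial\mathbb{L}}{\partial t} = \frac{\partial L}{\partial t}\frac{dt}{ds}$  (16.62)

$\frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}$  (16.63)

$\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)} = L - \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{dt}$  (16.64)

where $1 \le \mu \le n$ since the $\mu = 0$ time derivatives are written explicitly in equations 16.62, 16.64.

Equations 16.63 - 16.64, summed over the extended range $0 \le \mu \le n$ of time and spatial dynamical variables, imply

$\sum_{\mu=0}^{n}\frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)}\frac{dq^\mu}{ds} = L\frac{dt}{ds} - \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{dt}\frac{dt}{ds} + \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{ds} = \mathbb{L}$  (16.65)

Equation 16.65 can be written in the form

$\mathbb{L} - \sum_{\mu=0}^{n}\frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)}\frac{dq^\mu}{ds} \begin{cases} = 0 & \text{if } \mathbb{L} \text{ is not homogeneous in } \frac{dq^\mu}{ds} \\ \equiv 0 & \text{if } \mathbb{L} \text{ is homogeneous in } \frac{dq^\mu}{ds} \end{cases}$  (16.66)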


If the extended Lagrangian $\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ is homogeneous to first order in the $n + 1$ variables $\frac{dq^\mu}{ds}$ then Euler's theorem on homogeneous functions trivially implies the relation given in equation 16.66. Struckmeier[Str08] identified a subtle but important point that if $\mathbb{L}$ is not homogeneous in $\frac{dq^\mu}{ds}$, then equation 16.66 is not an identity but is an implicit equation that is always satisfied as the system evolves according to the solution of the extended Euler-Lagrange equations. Then equation 16.59 is satisfied without it being a homogeneous form in the $n + 1$ velocities $\frac{dq^\mu}{ds}$. This introduces a new class of non-homogeneous Lagrangians. The relativistic free particle, discussed in example 16.5, is a case of a non-homogeneous extended Lagrangian.
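
The distinction between an identity and an implicit equation in 16.66 can be made concrete numerically. The sketch below is an illustration under stated assumptions: it uses one spatial coordinate, units with $c = 1$, a harmonic-oscillator conventional Lagrangian chosen purely as an example, and finite differences in place of analytic derivatives. The names `euler_defect`, `homogeneous_ext`, and `nonhomogeneous_ext` are hypothetical.

```python
def euler_defect(ext_lagrangian, q, qs, t, ts, h=1e-6):
    """Left-hand side of equation 16.66 for one spatial coordinate:
    L_ext - (dq/ds) dL_ext/d(dq/ds) - (dt/ds) dL_ext/d(dt/ds),
    with the velocity derivatives taken by central finite differences."""
    d_qs = (ext_lagrangian(q, qs + h, t, ts) - ext_lagrangian(q, qs - h, t, ts)) / (2 * h)
    d_ts = (ext_lagrangian(q, qs, t, ts + h) - ext_lagrangian(q, qs, t, ts - h)) / (2 * h)
    return ext_lagrangian(q, qs, t, ts) - qs * d_qs - ts * d_ts

M, K, C = 1.0, 2.0, 1.0  # illustrative mass, spring constant, and speed of light

def conventional(q, qdot, t):
    """An example conventional Lagrangian L = T - U (harmonic oscillator)."""
    return 0.5 * M * qdot**2 - 0.5 * K * q**2

def homogeneous_ext(q, qs, t, ts):
    """Equation 16.59: L_ext = L(q, (dq/ds)/(dt/ds), t) * dt/ds."""
    return conventional(q, qs / ts, t) * ts

def nonhomogeneous_ext(q, qs, t, ts):
    """Free-particle extended Lagrangian of example 16.5, equation (alpha)."""
    return 0.5 * M * C**2 * ((qs / C) ** 2 - ts**2 - 1.0)

# Homogeneous case: the defect vanishes for arbitrary values of the velocities.
identity = euler_defect(homogeneous_ext, 0.3, 0.75, 0.0, 2.0)

# Non-homogeneous case: the defect vanishes only where (dt/ds)^2 - (dq/ds)^2/c^2 = 1.
off_shell = euler_defect(nonhomogeneous_ext, 0.0, 0.75, 0.0, 2.0)
on_shell = euler_defect(nonhomogeneous_ext, 0.0, 0.75, 0.0, 1.25)
```

The Lagrangian built from equation 16.59 satisfies 16.66 at every point, whereas the non-homogeneous form satisfies it only on the constraint surface, which is the sense in which 16.66 becomes an implicit equation.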

16.6.3 Extended generalized momenta

The generalized momentum is defined by

$p_\mu = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}$  (16.67)

Assume that the definitions of the extended Lagrangian $\mathbb{L}$, and the extended Hamiltonian $\mathbb{H}$, are related by a Legendre transformation, and are based on variational principles, analogous to the relation that exists between the conventional Lagrangian $L$ and Hamiltonian $H$. The Legendre transformation requires defining the extended generalized (canonical) momentum-energy four vector $\mathbf{P}(s) = (\frac{\mathcal{E}(s)}{c}, \mathbf{p}(s))$. The momentum components of the momentum-energy four vector $\mathbf{P}(s) = (\frac{\mathcal{E}(s)}{c}, \mathbf{p}(s))$ are given by the $1 \le \mu \le n$ components using equation 16.63.

$p_\mu(s) = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}$  (16.68)

The $\mu = 0$ component of the momentum-energy four vector can be derived by recognizing that the right-hand side of equation 16.64 is equal to $-H(p_\mu, q^\mu, t)$. That is, the corresponding generalized momentum $p_0$, that is conjugate to $q_0 = ct$, is given by

$p_0 = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^0}{ds}\right)} = \frac{1}{c}\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)} = -\frac{H(p_\mu, q^\mu, t)}{c}$  (16.69)

16.6.4  Extended  Lagrange  equations  of  motion 

By direct analogy with the non-relativistic action integral 16.55, the extremum for the relativistic action integral $\mathbb{S}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ is obtained using the Euler-Lagrange equations derived from equation 16.56 where the independent variable is $s$. This implies that for $0 \le \mu \le n$

$\frac{d}{ds}\left(\frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)}\right) - \frac{\partial\mathbb{L}}{\partial q^\mu} = \mathbb{Q}^{EX}_\mu = \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q^\mu} + \mathbb{Q}^{EXC}_\mu$  (16.70)

where the extended generalized force $\mathbb{Q}^{EX}_\mu$, shown on the right-hand side of equation 16.70, accounts for all forces not included in the potential energy term in the Lagrangian. The extended generalized force $\mathbb{Q}^{EX}_\mu$ can be factored into two terms as discussed in chapter 6, equation 6.47. The Lagrange multiplier term includes $1 \le k \le m$ holonomic constraint forces where the $m$ holonomic constraints, which do no work, are expressed in terms of the $m$ algebraic equations of holonomic constraint $g_k$. The $\mathbb{Q}^{EXC}_\mu$ term includes the remaining constraint forces and generalized forces that are not included in the Lagrange multiplier term or the potential energy term of the Lagrangian.

For the case where $\mu = 0$, since $q_0 = ct$, then equation 16.70 reduces to

$\frac{d}{ds}\left(\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)}\right) - \frac{\partial\mathbb{L}}{\partial t} = \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial t} + \mathbb{Q}^{EXC}_0$  (16.71)

These Euler-Lagrange equations of motion 16.70, 16.71 determine the $1 \le \mu \le n$ generalized coordinates $q^\mu(s)$, plus $q^0 = ct(s)$ in terms of the independent variable $s$.

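If the holonomic equations of constraint are time independent, that is $\frac{\partial g_k}{\partial t} = 0$, and if $\mathbb{Q}^{EXC}_0 = 0$, then the $\mu = 0$ term of the Euler-Lagrange equations simplifies to

$\frac{d}{ds}\left(\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)}\right) = \frac{\partial\mathbb{L}}{\partial t}$  (16.72)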


One interpretation is to select $L$ to be primary. Then $\mathbb{L}$ is derived from $L$ using equation 16.59 and $\mathbb{L}$ must satisfy the identity given by equation 16.66 while the Euler-Lagrange equation containing $\frac{dt}{ds}$ yields an identity which implies that $\mathbb{L}$ does not provide an equation of motion in terms of $t(s)$. Conversely, if $\mathbb{L}$ is chosen to be primary, then $\mathbb{L}$ is no longer a homogeneous function and equation 16.66 serves as a constraint on the motion that can be used to deduce $L$, while the equation containing $\frac{dt}{ds}$ yields a non-trivial equation of motion in terms of $t(s)$. In both cases the occurrence of a constraint surface results from the fact that the extended space has $2n + 2$ variables to describe $2n + 1$ degrees of freedom, that is, one more degree of freedom than required for the actual system.


16.5  Example:  Lagrangian  for  a relativistic  free  particle 

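The standard Lagrangian $L = T - U$ is not Lorentz invariant. The extended Lagrangian $\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ introduces the independent variable $s$ which treats both the space variables $\mathbf{q}(s)$ and time variable $q_0 = ct(s)$ equally. This can be achieved by defining the non-standard Lagrangian

$\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}) = \frac{1}{2}mc^2\left[\frac{1}{c^2}\left(\frac{d\mathbf{q}}{ds}\right)^2 - \left(\frac{dt}{ds}\right)^2 - 1\right]$  ($\alpha$)

The constant third term in the bracket is included to ensure that the extended Lagrangian converges to the standard Lagrangian in the limit $\frac{dt}{ds} \to 1$.

Note that the extended Lagrangian ($\alpha$) is not homogeneous to first order in the velocities $\frac{d\mathbf{q}}{ds}$ as is required. Equation 16.66 must be used to ensure that equation ($\alpha$) is homogeneous. That is, it must satisfy the constraint relation

$\left(\frac{dt}{ds}\right)^2 - \frac{1}{c^2}\left(\frac{d\mathbf{q}}{ds}\right)^2 - 1 = 0$  ($\beta$)

Inserting ($\beta$) into the extended Lagrangian ($\alpha$) yields that the square bracket in equation ($\alpha$) must equal $-2$. Thus

$\mathbb{L} = \frac{1}{2}mc^2\,[-2] = -mc^2$  ($\gamma$)

The constraint equation ($\beta$) implies that

$\frac{ds}{dt} = \sqrt{1 - \frac{1}{c^2}\left(\frac{d\mathbf{q}}{dt}\right)^2} = \frac{1}{\gamma}$  ($\delta$)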

Using equation ($\delta$) gives that the relativistic Lagrangian is

$L = \mathbb{L}\frac{ds}{dt} = \frac{\mathbb{L}}{\gamma} = -mc^2\sqrt{1 - \beta^2}$  ($\epsilon$)

Equation ($\epsilon$) is the conventional relativistic Lagrangian derived by assuming that the system evolution parameter $s$ is transformed to be along the world line $ds$, where the invariant length $ds$ replaces the proper time interval

$ds = c\,d\tau = \frac{c\,dt}{\gamma}$

The definition of the generalized (canonical) momentum

$p_i = \frac{\partial L}{\partial\dot{q}_i} = \gamma m\dot{q}_i$  ($\zeta$)

leads to the relativistic expression for momentum given in equation 16.21.

The relativistic Lagrangian is an important example of a non-standard Lagrangian. Equation ($\alpha$) does not equal the difference between the kinetic and potential energies, that is, the relativistic expression for kinetic energy is given by 16.28 to be

$T = (\gamma - 1)mc^2$  ($\eta$)

The non-standard relativistic Lagrangian ($\epsilon$) can be used with the Euler-Lagrange equations to derive the second-order equations of motion for both relativistic and non-relativistic problems within the Special Theory of Relativity.
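
The step from the relativistic Lagrangian to the canonical momentum can be checked numerically. The sketch below assumes one-dimensional motion and units with $m = c = 1$, and it differentiates $L = -mc^2\sqrt{1 - v^2/c^2}$ by a central finite difference rather than analytically; the function names and numerical values are illustrative only.

```python
import math

def lagrangian(v, m=1.0, c=1.0):
    """Relativistic free-particle Lagrangian L = -m c^2 sqrt(1 - v^2/c^2)."""
    return -m * c**2 * math.sqrt(1.0 - (v / c) ** 2)

def canonical_momentum(v, m=1.0, c=1.0, h=1e-6):
    """p = dL/dv evaluated by a central finite difference."""
    return (lagrangian(v + h, m, c) - lagrangian(v - h, m, c)) / (2 * h)

v = 0.6
gamma = 1.0 / math.sqrt(1.0 - v**2)
p_numeric = canonical_momentum(v)   # to be compared with gamma * m * v
p_expected = gamma * 1.0 * v

# For v << c, L + m c^2 approaches the Newtonian kinetic energy m v^2 / 2.
slow = lagrangian(1e-3) + 1.0
```

The finite-difference momentum agrees with $\gamma m v$, and the low-velocity limit shows why this non-standard Lagrangian still reproduces Newtonian mechanics: it differs from $\frac{1}{2}mv^2$ only by the constant $-mc^2$, which does not affect the equations of motion.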

16.6 Example: Relativistic particle in an external electromagnetic field

A charged particle moving at relativistic speed in an external electromagnetic field provides an example of the use of the relativistic Lagrangian.

In the discussion of classical mechanics it was shown that the velocity-dependent Lorentz force can be absorbed into the scalar electric potential $\Phi$ plus the vector magnetic potential $\mathbf{A}$. That is, the potential energy is given by equation 7.6 to be $U = q(\Phi - \mathbf{A}\cdot\mathbf{v})$. Including this in the Lagrangian, 16.71, gives

$L = -\frac{mc^2}{\gamma} - U = -mc^2\sqrt{1 - \beta^2} - q\Phi + q\mathbf{A}\cdot\mathbf{v}$

The three spatial partial derivatives can be written in vector notation as

$\frac{\partial L}{\partial\mathbf{r}} = -q\nabla\Phi + q\nabla(\mathbf{v}\cdot\mathbf{A})$  (a)

and the generalized momentum is given by

$\mathbf{p} = \frac{\partial L}{\partial\mathbf{v}} = \gamma m\mathbf{v} + q\mathbf{A}$

which is identical in form to the non-relativistic answer given by equation 7.6. That is, it includes the momentum of the electromagnetic field plus the classical linear momentum of the moving particle.

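The total time derivative of the generalized momentum is

$\frac{d\mathbf{p}}{dt} = \frac{d}{dt}\left(\frac{\partial L}{\partial\mathbf{v}}\right) = \frac{d}{dt}(\gamma m\mathbf{v}) + q\frac{d\mathbf{A}}{dt}$  (b)

where the last term is given by the chain rule

$\frac{d\mathbf{A}}{dt} = \frac{\partial\mathbf{A}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{A}$  (c)

Using equations a, b, c in the Euler-Lagrange equation gives

$\frac{d}{dt}\left(\frac{\partial L}{\partial\mathbf{v}}\right) = \frac{\partial L}{\partial\mathbf{r}}$

$\frac{d}{dt}(\gamma m\mathbf{v}) + q\frac{d\mathbf{A}}{dt} = -q\nabla\Phi + q\nabla(\mathbf{v}\cdot\mathbf{A})$

Collecting terms and using the well-known vector-product identity $\mathbf{v}\times(\nabla\times\mathbf{A}) = \nabla(\mathbf{v}\cdot\mathbf{A}) - (\mathbf{v}\cdot\nabla)\mathbf{A}$, plus the definition $\mathbf{B} = \nabla\times\mathbf{A}$, gives

$\frac{d}{dt}(\gamma m\mathbf{v}) = -q\nabla\Phi - q\frac{\partial\mathbf{A}}{\partial t} + q\left[\nabla(\mathbf{v}\cdot\mathbf{A}) - (\mathbf{v}\cdot\nabla)\mathbf{A}\right]$

$= -q\left[\nabla\Phi + \frac{\partial\mathbf{A}}{\partial t}\right] + q\left[\mathbf{v}\times\nabla\times\mathbf{A}\right]$

$\mathbf{F} = q\left[\mathbf{E} + \mathbf{v}\times\mathbf{B}\right]$

If we adopt the definition that the relativistic linear momentum is $\mathbf{p} = \gamma m\mathbf{v}$ then the left-hand side is the relativistic force while the right-hand side is the well-known Lorentz force of electromagnetism. Thus the extended Lagrangian formulation correctly reproduces the well-known Lorentz force for a charged particle moving in an electromagnetic field.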

16.7 Lorentz-invariant formulations of Hamiltonian mechanics

16.7.1  Extended  canonical  formalism 

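A Lorentz-invariant formulation of Hamiltonian mechanics can be developed that is built upon the extended Lagrangian formalism assuming that the Hamiltonian and Lagrangian are related by a Legendre transformation. That is,

$H(\mathbf{q}, \mathbf{p}, t) = \sum_{\mu=1}^{n} p_\mu\frac{dq^\mu}{dt} - L(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t)$  (16.73)

where the generalized momentum is defined by

$p_\mu = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}$  (16.74)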


Struckmeier[Str08] assumes that the definitions of the extended Lagrangian $\mathbb{L}$, and the extended Hamiltonian $\mathbb{H}$, are related by a Legendre transformation, and are based on variational principles, analogous to the relation that exists between the conventional Lagrangian $L$ and Hamiltonian $H$. The Legendre transformation requires defining the extended generalized (canonical) momentum-energy four vector $\mathbf{P}(s) = (\frac{\mathcal{E}(s)}{c}, \mathbf{p}(s))$. The momentum components of the momentum-energy four vector $\mathbf{P}(s) = (\frac{\mathcal{E}(s)}{c}, \mathbf{p}(s))$ are given by the $1 \le \mu \le n$ components using either the conventional or the extended Lagrangians as given in equation 16.68

$p_\mu(s) = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}$  (16.68)

The $\mu = 0$ component of the momentum-energy four vector is given by equation 16.69

$p_0 = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^0}{ds}\right)} = -\frac{H(p_\mu, q^\mu, t)}{c} = -\frac{\mathcal{E}(s)}{c}$  (16.75)


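where $\mathcal{E}(s)$ represents the instantaneous generalized energy of the conventional Hamiltonian at the point $s$, but not the functional form of $H(\mathbf{q}(s), \mathbf{p}(s), t(s))$. That is

$\mathcal{E}(s) \stackrel{\not\equiv}{=} H(\mathbf{q}(s), \mathbf{p}(s), t(s))$  (16.76)

Note that $\mathcal{E}(s)$ does not give the function $H(\mathbf{q}, \mathbf{p}, t)$. Equations 16.68 and 16.69 give that

$p_\mu(s) = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)}$  (16.77)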


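The extended Hamiltonian $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$, in an extended phase space, can be defined by the Legendre transformation and the four-vector $\mathbf{P}$ to be

$\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s)) = \left(\mathbf{P}\cdot\frac{d\mathbf{q}}{ds}\right) - \mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}) = \sum_{\mu=0}^{n} p_\mu\frac{dq^\mu}{ds} - \mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$  (16.78)

$= \sum_{\mu=1}^{n} p_\mu\frac{dq^\mu}{ds} - \mathcal{E}\frac{dt}{ds} - \mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$  (16.79)

where the $p_0$ term has been written explicitly as $-\mathcal{E}\frac{dt}{ds}$ in equation 16.79. The extended Hamiltonian $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$ can carry all the information on the dynamical system that is carried by the extended Lagrangian $\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ if the Hesse matrix is non-singular. That is, if

$\det\left(\frac{\partial^2\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)\partial\left(\frac{dq^\nu}{ds}\right)}\right) \neq 0$  (16.80)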

If the extended Lagrangian $\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ is not homogeneous in the $n + 1$ velocities $\frac{dq^\mu}{ds}$, then the extended set of Euler-Lagrange equations 16.72 is not redundant. Thus equation 16.66 is not an identity but it can be regarded as an implicit equation that is always satisfied by the extended set of Euler-Lagrange equations. As a result, the Legendre transformation to an extended Hamiltonian exists. That is, equation 16.66 is identical to the Legendre transform for $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$ which was shown to equal zero. Therefore

$\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s)) = 0$  (16.81)

which means that the extended Hamiltonian $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$ directly defines the restricted hypersurface on which the particle motion is confined.

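The extended canonical equations of motion, derived using the extended Hamiltonian
$\mathbb{H}(\mathbf{q}(s),\mathbf{p}(s),t(s),\mathcal{E}(s))$ with the usual Hamiltonian mechanics relations, are:

$$\frac{\partial\mathbb{H}}{\partial p_\mu} = \frac{dq^\mu}{ds} \tag{16.82}$$

$$\frac{\partial\mathbb{H}}{\partial q^\mu} = -\frac{dp_\mu}{ds} \tag{16.83}$$

$$\frac{\partial\mathbb{H}}{\partial t} = \frac{d\mathcal{E}}{ds} \tag{16.84}$$

$$\frac{\partial\mathbb{H}}{\partial\mathcal{E}} = -\frac{dt}{ds} \tag{16.85}$$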

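These canonical equations give that the total derivative of $\mathbb{H}(\mathbf{q}(s),\mathbf{p}(s),t(s),\mathcal{E}(s))$ with respect to $s$ is

$$\frac{d\mathbb{H}}{ds} = \frac{\partial\mathbb{H}}{\partial p_\mu}\frac{dp_\mu}{ds} + \frac{\partial\mathbb{H}}{\partial q^\mu}\frac{dq^\mu}{ds} + \frac{\partial\mathbb{H}}{\partial t}\frac{dt}{ds} + \frac{\partial\mathbb{H}}{\partial\mathcal{E}}\frac{d\mathcal{E}}{ds}$$

$$= \frac{dq^\mu}{ds}\frac{dp_\mu}{ds} - \frac{dp_\mu}{ds}\frac{dq^\mu}{ds} + \frac{d\mathcal{E}}{ds}\frac{dt}{ds} - \frac{dt}{ds}\frac{d\mathcal{E}}{ds} = 0 \tag{16.86}$$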

That is, in contrast to the total time derivative of $H(\mathbf{q},\mathbf{p},t)$, the total $s$ derivative of the extended
Hamiltonian $\mathbb{H}(\mathbf{q}(s),\mathbf{p}(s),t(s),\mathcal{E}(s))$ always vanishes; that is, $\mathbb{H}$ is autonomous, which is ideal
for use with Hamilton's equations of motion. The constraints give that $\mathbb{H}(\mathbf{q}(s),\mathbf{p}(s),t(s),\mathcal{E}(s)) = 0$ (equation
16.81) and $\frac{d\mathbb{H}}{ds} = 0$ (equation 16.86), implying that the correlation between the extended and conventional
Hamiltonians is given by


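$$\mathbb{H}(\mathbf{q}(s),\mathbf{p}(s),t(s),\mathcal{E}(s)) = \sum_{\mu=1}^{n} p_\mu\frac{dq^\mu}{ds} - \mathcal{E}\frac{dt}{ds} - \mathbb{L}\left(\mathbf{q},\frac{d\mathbf{q}}{ds},t,\frac{dt}{ds}\right) \tag{16.87}$$

$$= \sum_{\mu=1}^{n} p_\mu\frac{dq^\mu}{dt}\frac{dt}{ds} - \mathcal{E}\frac{dt}{ds} - L\left(\mathbf{q},\frac{d\mathbf{q}}{dt},t\right)\frac{dt}{ds} \tag{16.88}$$

$$= \left[\sum_{\mu=1}^{n} p_\mu\frac{dq^\mu}{dt} - L\left(\mathbf{q},\frac{d\mathbf{q}}{dt},t\right) - \mathcal{E}\right]\frac{dt}{ds} \tag{16.89}$$

$$= \left(H(\mathbf{q},\mathbf{p},t) - \mathcal{E}\right)\frac{dt}{ds} = 0 \tag{16.90}$$

since only the term with $\mu = 0$ does not cancel in equation 16.79. Equations 16.81 and 16.90 give that both the
left and right-hand sides of equation 16.90 are zero, while equation 16.86 implies that $\mathbb{H}(\mathbf{q}(s),\mathbf{p}(s),t(s),\mathcal{E}(s))$
is a constant of motion, that is, $s$ is a cyclic variable for $\mathbb{H}(\mathbf{q}(s),\mathbf{p}(s),t(s),\mathcal{E}(s))$. Formally one can consider
that the extended Hamiltonian is a constant which equals zero

$$\mathbb{H}(\mathbf{q},\mathbf{p},t,\mathcal{E}(s)) = E(s) = 0 \tag{16.91}$$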

Equations 16.84 and 16.85 imply that $(\mathcal{E},t)$ form a pair of canonically conjugate variables in addition to the
newly-introduced canonically-conjugate variables $(E(s),s)$. Equation 16.90 shows that the motion in the
$(2n+2)$-dimensional extended phase space is constrained to the surface $\mathbb{H} = 0$, reflecting the fact that the
observed system has one fewer degree of freedom than is used by the extended Hamiltonian.
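
The vanishing of the extended Hamiltonian along a trajectory can be checked numerically. The following is a
minimal sketch that is not taken from the text. It assumes the simplest parametrization $\frac{dt}{ds} = 1$, for which
equation 16.90 reduces to $\mathbb{H} = H(q,p,t) - \mathcal{E}$, and it assumes an illustrative driven oscillator
$H = \frac{p^2}{2} + \frac{q^2}{2} - q\cos(\omega t)$ with unit mass and unit natural frequency. The function names, the drive
frequency, and the hand-written Runge-Kutta integrator are all hypothetical choices made for illustration.

```python
import math

OMEGA = 0.7  # drive frequency (illustrative assumption)


def H(q, p, t):
    """Conventional, explicitly time-dependent Hamiltonian (assumed example)."""
    return 0.5 * p * p + 0.5 * q * q - q * math.cos(OMEGA * t)


def rhs(state):
    """Extended canonical equations 16.82-16.85 for HH = H(q, p, t) - E."""
    q, p, t, E = state
    dq = p                                # dq/ds =  dHH/dp
    dp = -q + math.cos(OMEGA * t)         # dp/ds = -dHH/dq
    dt = 1.0                              # dt/ds = -dHH/dE
    dE = q * OMEGA * math.sin(OMEGA * t)  # dE/ds =  dHH/dt
    return (dq, dp, dt, dE)


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step in the parameter s."""
    def shifted(k, factor):
        return tuple(x + factor * kx for x, kx in zip(state, k))

    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return tuple(
        x + h / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def evolve(s_end=10.0, h=1e-3):
    """Integrate from s = 0 to s_end, starting on the surface HH = 0."""
    q, p, t = 1.0, 0.0, 0.0
    state = (q, p, t, H(q, p, t))
    for _ in range(round(s_end / h)):
        state = rk4_step(state, h)
    return state


q, p, t, E = evolve()
extended_H = H(q, p, t) - E            # stays at zero up to integration error
energy_change = E - H(1.0, 0.0, 0.0)   # the conventional energy is not conserved
print(f"extended Hamiltonian at end of run: {extended_H:.2e}")
print(f"change in the energy variable E:    {energy_change:.4f}")
```

In this sketch the conventional energy changes because the drive does work on the oscillator, whereas the
extended Hamiltonian remains on the surface $\mathbb{H} = 0$ to within the integration error.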

In summary, the Lorentz-invariant extended canonical formalism leads to Hamilton's first-order equations
of motion in terms of derivatives with respect to $s$, where $s$ is related to the proper time $\tau$ for a relativistic
system.

16.7.2 Extended Poisson Bracket representation


Struckmeier[Str08]  investigated  the  usefulness  of  the  extended  formalism  when  applied  to  the  Poisson  bracket 
representation  of  Hamiltonian  mechanics.  The  extended  Poisson  bracket  for  two  differentiable  functions  F 
and  G is  defined  as 


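$$[[F,G]] = \sum_{i=1}^{n}\left(\frac{\partial F}{\partial q^i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q^i}\right) - \frac{\partial F}{\partial t}\frac{\partial G}{\partial\mathcal{E}} + \frac{\partial F}{\partial\mathcal{E}}\frac{\partial G}{\partial t} \tag{16.92}$$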


As for the conventional Poisson bracket discussed in chapter 14, the extended Poisson bracket also leads to the
fundamental Poisson bracket relations

$$[[q^i,q^j]] = 0 \qquad [[p_i,p_j]] = 0 \qquad [[q^i,p_j]] = \delta^i_j \tag{16.93}$$

where $i,j = 0,1,\ldots,n$. These are identical to the non-extended fundamental Poisson brackets.

The discussion of observables in Hamiltonian mechanics in chapter 14.3.4 can be trivially expanded to
the extended Poisson bracket representation. In particular, the total $s$ derivative of the function $G$ is given by

$$\frac{dG}{ds} = \frac{\partial G}{\partial s} + [[G,\mathbb{H}]] \tag{16.94}$$

If $G$ commutes with the extended Hamiltonian, that is, the Poisson bracket equals zero, and if $\frac{\partial G}{\partial s} = 0$, then
$\frac{dG}{ds} = 0$. That is, the observable $G$ is a constant of motion.

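Substituting the fundamental variables for $G$ gives

$$[[q^\mu,\mathbb{H}]] = \frac{\partial\mathbb{H}}{\partial p_\mu} = \frac{dq^\mu}{ds} \qquad [[p_\mu,\mathbb{H}]] = -\frac{\partial\mathbb{H}}{\partial q^\mu} = \frac{dp_\mu}{ds} \tag{16.95}$$

where $\mu = 0,1,\ldots,n$. These are Hamilton's extended canonical equations of motion expressed in terms of
the system evolution parameter $s$. The extended Poisson bracket representation is a trivial extension of the
conventional canonical equations presented in chapter 14.3.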
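
The extended bracket of equation 16.92 can be evaluated numerically to confirm these relations. The sketch
below is an illustration only, written for one spatial degree of freedom ($n = 1$) with central finite differences.
The extended Hamiltonian used, $\mathbb{H} = \frac{p^2}{2} + \frac{1}{2}(1 + 0.1t)q^2 - \mathcal{E}$, the evaluation point, and all function names are
assumptions chosen for the example and do not come from the text.

```python
Q, P, T, E = range(4)  # positions of (q, p, t, E) in the argument list


def partial(f, point, index, h=1e-5):
    """Central-difference estimate of the partial derivative of f at point."""
    up, down = list(point), list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)


def extended_bracket(F, G, point):
    """Extended Poisson bracket [[F, G]] of equation 16.92 for n = 1."""
    def d(f, i):
        return partial(f, point, i)

    return (d(F, Q) * d(G, P) - d(F, P) * d(G, Q)
            - d(F, T) * d(G, E) + d(F, E) * d(G, T))


# Fundamental variables written as functions on the extended phase space.
def q_of(q, p, t, e): return q
def p_of(q, p, t, e): return p
def t_of(q, p, t, e): return t
def e_of(q, p, t, e): return e


def HH(q, p, t, e):
    """Assumed extended Hamiltonian: HH = H(q, p, t) - E with dt/ds = 1."""
    return 0.5 * p * p + 0.5 * (1.0 + 0.1 * t) * q * q - e


point = (0.3, -1.2, 2.0, 0.5)  # arbitrary (q, p, t, E)

print("[[q, p]]  =", extended_bracket(q_of, p_of, point))   # fundamental bracket
print("[[t, E]]  =", extended_bracket(t_of, e_of, point))   # time-energy pair
print("dq/ds     =", extended_bracket(q_of, HH, point))     # equals dHH/dp
print("dp/ds     =", extended_bracket(p_of, HH, point))     # equals -dHH/dq
print("dt/ds     =", extended_bracket(t_of, HH, point))     # equals -dHH/dE
print("dE/ds     =", extended_bracket(e_of, HH, point))     # equals dHH/dt
```

The bracket $[[t,\mathcal{E}]]$ evaluates to $-1$ in this sketch, which is consistent with $p_0 = -\mathcal{E}/c$ being the momentum
conjugate to $q^0 = ct$.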


16.7.3 Extended canonical transformation and Hamilton-Jacobi theory

Struckmeier[Str08] presented plausible extended versions of canonical transformation and Hamilton-Jacobi
theories that can be used to provide a Lorentz-invariant formulation of Hamiltonian mechanics for relativistic
one-body systems. A detailed description can be found in Struckmeier[Str08].⁵

16.7.4  Validity  of  the  extended  Hamilton-Lagrange  formalism 

It has been shown that the extended Lagrangian and Hamiltonian formalism, based on the parametric model
of Lanczos[La49], leads to a plausible manifestly-covariant approach for the one-body system. The general
features developed for handling Lagrangian and Hamiltonian mechanics carry over to the Special Theory
of Relativity assuming the use of a non-standard, extended Lagrangian or Hamiltonian. This expansion of
the range of validity of the well-known Hamiltonian and Lagrangian mechanics into the relativistic domain
is important, and reduces any Lorentz transformation to a canonical transformation. The validity of this
extended Hamilton-Lagrange formalism has been criticized, and problems exist in extending this approach to
the $N$-body system for $N > 1$. For example, as discussed by Goldstein[Go50] and Johns[Jo05], each of
the $N$ moving bodies has its own world line and momentum. Defining the total momentum $P$ requires
knowing simultaneously the momenta of the individual bodies, but simultaneity is body dependent and
thus even the total momentum is not a simple four-vector. A general method is required that will allow
using a manifestly-covariant Lagrangian or Hamiltonian for the $N$-body system. For the one-body system,
the extended Hamilton-Lagrange formalism provides a powerful and logical approach to exploit analytical
mechanics in the relativistic domain that retains the form of the conventional Lagrangian/Hamiltonian
formalisms. Note that Noether's theorem relating energy and time is readily apparent using the extended
formalism.


⁵ Note that Greiner[Gr10] includes a reproduction of the Struckmeier paper[Str08].

16.7 Example: The Bohr-Sommerfeld hydrogen atom

The classical relativistic hydrogen atom was first solved by Sommerfeld in 1916. Sommerfeld used Bohr's
"old quantum theory" plus Hamiltonian mechanics to make an important step in the development of quantum
mechanics by obtaining the first-order expressions for the fine structure of the hydrogen atom. As in the
non-relativistic case, the motion is confined to a plane, allowing use of planar polar coordinates. Thus the
relativistic Lagrangian is given by


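$$L = -\frac{mc^2}{\gamma} - U = -mc^2\sqrt{1 - \frac{\dot{r}^2 + r^2\dot{\theta}^2}{c^2}} + \frac{ke^2}{r}$$

The canonical momenta are given by

$$p_\theta = \frac{\partial L}{\partial\dot{\theta}} = m\gamma r^2\dot{\theta}$$

$$p_r = \frac{\partial L}{\partial\dot{r}} = m\gamma\dot{r}$$

$$\dot{p}_\theta = \frac{\partial L}{\partial\theta} = 0$$

$$\dot{p}_r = \frac{\partial L}{\partial r} = m\gamma r\dot{\theta}^2 - k\frac{e^2}{r^2}$$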

As for the non-relativistic case, $\theta$ is a cyclic variable and thus the
angular momentum $p_\theta = m\gamma r^2\dot{\theta}$ is conserved.

The relativistic Hamiltonian for the Coulomb potential between the electron and the proton, with the
motion again confined to a plane and described in planar polar coordinates, is

$$H = \sqrt{p_r^2c^2 + \frac{p_\theta^2c^2}{r^2} + m^2c^4} - \frac{ke^2}{r}$$

[Figure: The advance of the perihelion of bound orbits due to the dependence of the relativistic mass on
velocity.]


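The same equations of motion are obtained using Hamiltonian mechanics, that is:

$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{m\gamma r^2}$$

$$\dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{m\gamma}$$

$$\dot{p}_\theta = -\frac{\partial H}{\partial\theta} = 0$$

$$\dot{p}_r = -\frac{\partial H}{\partial r} = m\gamma r\dot{\theta}^2 - k\frac{e^2}{r^2}$$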


The radial dependence can be solved using either Lagrangian or Hamiltonian mechanics, but the solution
is non-trivial. Using the same techniques applied to solve Kepler's problem leads to the radial solution

$$r = \frac{q}{1 + \epsilon\cos[\Gamma(\theta - \theta_0)]}$$

where

$$\epsilon = \sqrt{1 + \frac{\Gamma^2\left(1 - \frac{m^2c^4}{E^2}\right)}{1 - \Gamma^2}}$$


The apses are $r_{min} = \frac{q}{1+\epsilon}$ for $\Gamma(\theta - \theta_0) = 0, 2\pi, 4\pi, \ldots$ and $r_{max} = \frac{q}{1-\epsilon}$ for $\Gamma(\theta - \theta_0) = \pi, 3\pi, \ldots$. The
perihelion advances between cycles due to the change in relativistic mass during the trajectory, as shown in
the adjacent figure. This precession leads to the fine structure observed in the optical spectra of the hydrogen
atom. The same precession of the perihelion occurs for planetary motion; however, there is an effect of
comparable size due to gravity that requires use of general relativity to compute the trajectories.
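
The size of the precession follows directly from the orbit equation: the radius returns to $r_{min}$ each time
$\Gamma(\theta - \theta_0)$ advances by $2\pi$, so the perihelion advances by $\Delta\theta = 2\pi\left(\frac{1}{\Gamma} - 1\right)$ per radial cycle. The sketch
below evaluates this advance. It assumes that $\Gamma = \sqrt{1 - \left(\frac{ke^2}{cp_\theta}\right)^2}$, which is what substituting the orbit into
the equations of motion above gives but which is not written out explicitly in this example, and it assumes the
old-quantum-theory value $p_\theta = n_\theta\hbar$, so that $\frac{ke^2}{cp_\theta} = \frac{\alpha}{n_\theta}$ where $\alpha$ is the fine-structure constant. The function
names are illustrative.

```python
import math

ALPHA = 1 / 137.035999  # fine-structure constant k e^2 / (hbar c), approximate


def gamma_factor(n_theta):
    """Capital Gamma of the orbit equation, assuming p_theta = n_theta * hbar."""
    return math.sqrt(1.0 - (ALPHA / n_theta) ** 2)


def radius(theta, q, eps, big_gamma, theta0=0.0):
    """Orbit r(theta) = q / (1 + eps * cos(Gamma * (theta - theta0)))."""
    return q / (1.0 + eps * math.cos(big_gamma * (theta - theta0)))


def perihelion_advance(n_theta):
    """Angle in radians by which the perihelion advances per radial cycle."""
    return 2.0 * math.pi * (1.0 / gamma_factor(n_theta) - 1.0)


for n in (1, 2, 3):
    print(f"n_theta = {n}: advance per cycle = {perihelion_advance(n):.3e} rad")
```

To lowest order in $\alpha$ the advance is $\pi\alpha^2/n_\theta^2$, so under these assumptions the precession is a small effect
that decreases rapidly with increasing angular momentum.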




16.8  The  General  Theory  of  Relativity 

The Special Theory of Relativity is restricted to inertial frames, that is, frames in uniform non-accelerated
motion, which are assumed to exist over all of space-time. In 1916 Einstein published the General Theory of
Relativity, which expands the scope of relativistic mechanics to include non-inertial accelerating frames plus a
unified theory of gravitation. The General Theory of Relativity incorporates both the Special Theory of
Relativity and Newton's Law of Universal Gravitation. It describes gravitation as a geometric property of
space and time. In particular, the curvature of space-time is directly related to
the four-momentum of matter and radiation. Unfortunately, Einstein's equations of general relativity are
nonlinear partial differential equations that are difficult to solve exactly, and the theory requires knowledge
of Riemannian geometry that goes beyond the scope of this book. However, it is useful to summarize the
fundamental concepts upon which the theory is based, and some of the observable implications, since the
General Theory of Relativity is an important branch of classical mechanics.

16.8.1  Fundamental  concepts 

The  development  of  general  relativity  by  Einstein  was  strongly  influenced  by  the  following  five  principles. 

Mach’s  principle: 

The 1883 work "The Science of Mechanics" by the philosopher/physicist Ernst Mach criticized Newton's
concept of an absolute frame of reference, and suggested that local physical laws are determined by the
large-scale structure of the universe. The concept is that local motion of a rotating frame is determined by
the large-scale distribution of matter, that is, relative to the fixed stars. Einstein's interpretation of Mach's
statement was that the inertial properties of a body are determined by the presence of other bodies in the
universe, and he named this concept Mach's Principle. Mach's Principle has never been developed into a
quantitative physical theory that would explain a mechanism by which the large-scale distribution of matter
can produce such an effect.

Equivalence  principle: 

The  equivalence  principle  comprises  closely-related  concepts  dealing  with  the  equivalence  of  gravitational  and 
inertial  mass.  The  weak  equivalence  principle  states  that  the  inertial  mass  and  gravitational  mass  of  a 
body  are  identical,  leading  to  an  acceleration  that  is  independent  of  the  nature  of  the  body.  This  experimental 
fact usually is attributed to Galileo. Recent measurements have shown that this weak equivalence principle
is obeyed to a sensitivity of $5 \times 10^{-13}$. Einstein's equivalence principle states that the outcome of
any  local  non-gravitational  experiment,  in  a freely  falling  laboratory,  is  independent  of  the  velocity  of  the 
laboratory  and  its  location  in  space-time.  This  principle  implies  that  the  result  of  local  experiments  must  be 
independent  of  the  velocity  of  the  apparatus.  Einstein’s  equivalence  principle  has  been  tested  by  searching 
for  variations  of  dimensionless  fundamental  constants  such  as  the  fine  structure  constant.  The  strong 
equivalence  principle  combines  the  weak  equivalence  and  Einstein  equivalence  principles,  and  implies 
that  the  gravitational  constant  is  constant  everywhere  in  the  universe.  The  strong  equivalence  principle 
suggests  that  gravity  is  geometrical  in  nature  and  does  not  involve  any  fifth  force  in  nature.  Einstein’s 
General  Theory  of  Relativity  satisfies  the  strong  equivalence  principle.  Tests  of  the  strong  equivalence 
principle  have  involved  searches  for  variations  in  the  gravitational  constant  G and  masses  of  fundamental 
particles  throughout  the  life  of  the  universe. 

Principle  of  covariance 

A physical law expressed in a covariant formulation has the same mathematical form in all coordinate systems,
and is usually expressed in terms of tensor fields. Maxwell's equations of electromagnetism are an example of
such a covariant formulation. In the Special Theory of Relativity, the Lorentz, rotational, translational and
reflection transformations between inertial coordinate frames all are covariant. The covariant quantities are
the 4-scalars and 4-vectors in Minkowski space-time. Einstein recognized that the principle of covariance
that is built into the Special Theory of Relativity should apply equally to accelerated relative motion in
the General Theory of Relativity. He exploited tensor calculus to extend the Lorentz covariance to the
more general local covariance in the General Theory of Relativity. The reduction locally of the general
metric tensor to the Minkowski metric corresponds to free-falling motion, that is, geodesic motion, and thus
encompasses gravitation. Unified field theory involves attempts to extend the General Theory of Relativity
to incorporate other physical phenomena within a covariant framework in a purely geometric representation
in space-time.


Correspondence  principle 

The Correspondence Principle states that the predictions of any new scientific theory must reduce to the
predictions of well established earlier theories under circumstances for which the preceding theory was known
to be valid. This also is referred to as the "correspondence limit". The Correspondence Principle is an
important concept used both in quantum mechanics and relativistic mechanics. Einstein's Special Theory
of Relativity satisfies the Correspondence Principle because it reduces to classical mechanics in the limit
of velocities small compared to the speed of light. The Correspondence Principle requires that the General
Theory of Relativity must reduce to the Special Theory of Relativity for inertial frames, and should
approximate Newton's Theory of Gravitation in weak fields and at low velocities.


Principle  of  minimal  gravitational  coupling 

The  principle  of  minimal  gravitational  coupling  requires  that  the  total  Lagrangian  for  the  field  equations  of 
general  relativity  consist  of  two  additive  parts,  one  part  corresponding  to  the  free  gravitational  Lagrangian, 
and  the  other  part  to  external  source  fields  in  curved  space-time.  That  is,  no  terms  explicitly  containing  the 
curvature  of  space-time  should  be  added  in  the  extension  from  the  special  to  general  theories  of  relativity. 

16.8.2  Einstein’s  postulates  of  the  General  Theory  of  Relativity 

Einstein realized that the Equivalence Principle relating the gravitational and inertial masses implies that
the constancy of the velocity of light in vacuum cannot hold in the presence of a gravitational field. That
is, the Minkowskian line element must be replaced by a more general line element that takes gravity into
account. Einstein proposed that the Minkowskian line element in four-dimensional space-time be replaced
by introducing a four-dimensional Riemannian geometrical structure where space, time, and matter are
combined. As described by Lanczos[La49], [Har03], [Mu08], this astonishingly bold proposal implies that planetary
motion is described as purely a geodesic phenomenon in a certain four-space of Riemannian structure, where
the geodesic is the equation of a curve on a manifold for any possible set of coordinates. This implies that
the concept of "gravitational force" is discarded, and planetary motion is a manifestation of a pure geodesic
phenomenon for forceless motion in a four-dimensional Riemannian structure. Chapter 5.10 showed that the
Lagrangian and Hamiltonian representations of variational mechanics are powerful approaches for determining
the equation governing geodesic constrained motion. In addition, these representations are independent
of the chosen frame of reference as required by the General Theory of Relativity. Thus variational mechanics
is the preeminent theoretical representation of the General Theory of Relativity and the predictions are
consistent with the fundamental concepts described in chapter 16.8.1.

To  summarize,  the  Special  Theory  of  Relativity  implies  that  the  Newtonian  concepts  of  absolute  frame 
of  reference  and  separation  of  space  and  time  are  invalid.  The  General  Theory  of  Relativity  goes  beyond 
the  Special  Theory  by  implying  that  the  gravitational  force,  and  the  resultant  planetary  motion,  can  be 
described  as  pure  geodesic  phenomena  for  forceless  motion  in  a four-dimensional  Riemannian  structure. 

16.8.3  Experimental  evidence 

The  evidence  in  support  of  Einstein’s  Theory  of  General  Relativity  is  compelling.  The  following  are  typical 
experimental  manifestations  of  the  General  Theory  of  Relativity. 


Kepler problem: In 1915 Einstein showed that relativistic mechanics explained the anomalous advance
of the perihelion of the planet Mercury, that is, the axes of the elliptical Kepler orbit precess. Example 16.1
discusses the analogue of this effect for the Bohr-Sommerfeld hydrogen atom.

Deflection of light: Eddington travelled to the island of Principe, near Africa, to watch the solar eclipse
of 29 May 1919. During the eclipse, he took pictures of the stars in the region around the Sun. According
to the theory of general relativity, stars with light rays that passed near the Sun would appear to have been
slightly shifted because their light had been curved by the Sun's gravitational field. This effect is noticeable
only during eclipses, since otherwise the Sun's brightness obscures the affected stars. The results confirmed
Einstein's prediction of the deflection of light in a gravitational field, which made Einstein famous.

Gravitational lensing: The deflection of light by the gravitational attraction of a massive object situated
between a distant source and the observer can produce multiple images of the source, as for the distant quasar
shown in figure 16.7.

Gravitational time dilation and frequency shift: Processes occurring in a strong gravitational field
are slower than in a weak gravitational field; this is called gravitational time dilation. In addition, light
climbing out of a gravitational well is red shifted. The gravitational time dilation has been measured many
times and the continued operation of the Global Positioning System provides an ongoing validation. The
gravitational red shift has been confirmed in the laboratory using the precise Mössbauer effect in nuclear
physics. Tests in stronger gravitational fields are provided by studies of binary pulsars. All of these
measurements confirm the general theory of relativity.

Gravitational waves: In 1976 Hulse and Taylor detected a decrease in the orbital period of the binary
pulsar PSR 1913+16 due to the significant energy loss associated with emission of gravitational waves by this
very compact neutron-star system. This was the first implied detection of gravitational waves. Early
attempts to detect gravitational waves directly were unsuccessful, but direct detection was achieved in 2015
by the LIGO observatories.

Black holes: When the mass to radius ratio of a massive object becomes sufficiently large, general
relativity predicts formation of a black hole, which is a region of space from which neither light nor matter
can escape. At the center of a galaxy there usually exists a supermassive black hole with a mass that is
$10^6 - 10^9$ solar masses, which is thought to have played an important role in formation of the galaxy.


Figure 16.7: Einstein's Cross comprises four images of a distant quasar imaged by a closer galaxy acting as a
gravitational lens. (Recorded by the ESA Faint Object Camera using the NASA Hubble telescope.)


16.9  Implications  of  relativistic  theory  to  classical  mechanics 

Einstein’s  theories  of  relativity  have  had  an  enormous  impact  on  twentieth  century  physics  and  the  philosophy 
of  science.  Relativistic  mechanics  is  crucial  to  an  understanding  of  the  physics  of  the  atom,  nucleus  and  the 
substructure  of  the  nucleons,  but  the  impacts  are  minimal  in  everyday  experience.  As  a consequence  the 
enormous  philosophical  implications  of  Einstein’s  theories  of  relativity  may  not  be  as  readily  apparent  as 
other  major  developments  during  the  20th  century.  In  spite  of  this,  it  is  important  to  be  cognizant  of 
the consequences of these theories of nature. The Special Theory of Relativity replaces Newton's Laws
of motion; that is, Newton's laws are only an approximation applicable for low velocities. The General Theory
of  Relativity  replaces  Newton’s  Law  of  Gravitation  and  provides  a natural  explanation  of  the  equivalence 
principle.  Einstein’s  theories  of  relativity  imply  a profound  and  fundamental  change  in  the  view  of  the 
separation  of  space,  time,  and  mass,  that  contradicts  the  basic  tenets  that  are  the  foundation  of  Newtonian 
mechanics.  The  Newtonian  concepts  of  absolute  frame  of  reference,  plus  the  separation  of  space,  time, 
and  mass,  are  invalid  at  high  velocities.  Lagrangian  and  Hamiltonian  variational  approaches  to  classical 
mechanics  provide  the  formalism  necessary  for  handling  relativistic  mechanics.  The  present  chapter  has 
shown that logical extensions of Lagrangian and Hamiltonian mechanics lead to the relativistically-invariant
extended Lagrangian and Hamiltonian formulations of mechanics, which are adequate for handling one-body
systems within the Special Theory of Relativity. However, major unsolved problems remain in applying these
formulations to systems having more than one body.


16.10 Summary

Special theory of relativity: The Special Theory of Relativity is based on Einstein's postulates:

1) The laws of nature are the same in all inertial frames of reference.

2) The velocity of light in vacuum is the same in all inertial frames of reference.

For a primed frame moving along the $x$ axis with velocity $v$, Einstein's postulates imply the following
Lorentz transformations between the moving (primed) and stationary (unprimed) frames

$$\begin{aligned}
x' &= \gamma(x - vt) & x &= \gamma(x' + vt') \\
y' &= y & y &= y' \\
z' &= z & z &= z' \\
t' &= \gamma\left(t - \frac{vx}{c^2}\right) & t &= \gamma\left(t' + \frac{vx'}{c^2}\right)
\end{aligned}$$

where the Lorentz $\gamma$ factor is $\gamma = \frac{1}{\sqrt{1 - \left(\frac{v}{c}\right)^2}}$.
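
The transformations summarized above can be exercised numerically. The following minimal sketch applies the
boost along $x$ and its inverse to a single event and compares the interval $(ct)^2 - x^2$ in the two frames.
The function names and the sample event are illustrative assumptions, not part of the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_gamma(v):
    """Lorentz factor for relative speed v along the x axis."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def boost(x, t, v):
    """Map an event (x, t) in the unprimed frame to (x', t') in the primed frame."""
    g = lorentz_gamma(v)
    return g * (x - v * t), g * (t - v * x / C ** 2)


def inverse_boost(x_p, t_p, v):
    """Map an event (x', t') in the primed frame back to the unprimed frame."""
    g = lorentz_gamma(v)
    return g * (x_p + v * t_p), g * (t_p + v * x_p / C ** 2)


def interval2(x, t):
    """Squared space-time interval (ct)^2 - x^2 between the event and the origin."""
    return (C * t) ** 2 - x ** 2


v = 0.6 * C
x, t = 100.0, 1.0e-6          # sample event: 100 m, 1 microsecond
x_p, t_p = boost(x, t, v)
x_back, t_back = inverse_boost(x_p, t_p, v)

print("gamma              :", lorentz_gamma(v))
print("primed coordinates :", x_p, t_p)
print("round trip         :", x_back, t_back)
print("interval, unprimed :", interval2(x, t))
print("interval, primed   :", interval2(x_p, t_p))
```

The round trip recovers the original event and the interval agrees in both frames to within floating-point
rounding, which is the invariance that underlies the Minkowski scalar product discussed below.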

Lorentz  transformations  were  used  to  illustrate  Lorentz  contraction,  time  dilation,  and  simultaneity.  An 
elementary  review  was  given  of  relativistic  kinematics  including  discussion  of  velocity  transformation,  linear 
momentum,  center-of-momentum  frame,  forces  and  energy. 

Geometry of space-time: The concepts of four-dimensional space-time were introduced. A discussion of
four-vector scalar products introduced the use of contravariant and covariant tensors plus the Minkowski
metric $g$ where the scalar product was defined. The Minkowski representation of space-time and the
momentum-energy four-vector also were introduced.

Lorentz-invariant formulation of Lagrangian mechanics: The Lorentz-invariant extended Lagrangian
formalism, developed by Struckmeier[Str08], based on the parametric approach pioneered by
Lanczos[La49], provides a viable Lorentz-invariant extension of conventional Lagrangian mechanics that
is applicable for one-body motion in the realm of the Special Theory of Relativity.

Lorentz-invariant formulation of Hamiltonian mechanics: The Lorentz-invariant extended Hamiltonian
formalism, developed by Struckmeier based on the parametric approach pioneered by Lanczos, was
introduced. It was shown to provide a viable Lorentz-invariant extension of conventional Hamiltonian
mechanics that is applicable for one-body motion in the realm of the Special Theory of Relativity. In
particular, it was shown that the Lorentz-invariant extended Hamiltonian is conserved, making it ideally suited
for solving complicated systems using Hamiltonian mechanics via use of the Poisson-bracket representation of
Hamiltonian mechanics, canonical transformations, and the Hamilton-Jacobi techniques.

The General Theory of Relativity: An elementary summary was given of the fundamental concepts
of the General Theory of Relativity and the resultant unified description of the gravitational force plus
planetary motion as geodesic motion in a four-dimensional Riemannian structure. Variational mechanics
was shown to be ideally suited to applications of the General Theory of Relativity.

Philosophical implications: Newton's equations of motion, and his Law of Gravitation, that reigned
supreme from 1687 to 1905, have been toppled from the throne by Einstein's theories of relativistic
mechanics. By contrast, the complete independence of coordinate frames in the Lagrangian and Hamiltonian
formulations of classical mechanics, and the underlying Principle of Least Action, are equally valid in both
the relativistic and non-relativistic regimes. As a consequence, relativistic Lagrangian and Hamiltonian
formulations underlie much of modern physics, especially quantum physics, which explains why relativistic
mechanics is so important to classical dynamics.⁶

⁶ Recommended reading:

"Mr. Tompkins in Paperback" by George Gamow. An excellent elementary description of the implications of the
Theory of Relativity.

"Gravity: An Introduction to Einstein's General Relativity" by James B. Hartle, Addison Wesley (2003).

"Classical Mechanics and Relativity" by H.J.W. Müller-Kirsten, World Scientific, Singapore (2008).

Workshop exercises

1. A relativistic snake of proper length 100 cm is travelling to the right across a butcher's table at $V = 0.6c$. You
hold two meat cleavers, one in each hand, which are 100 cm apart. You strike the table simultaneously with
both cleavers at the moment when the left cleaver lands just behind the tail of the snake. You rationalize that
since the snake is moving with $\beta = 0.6$, the length of the snake is Lorentz contracted by the factor $\gamma = \frac{5}{4}$,
and thus the Lorentz-contracted length of the snake is 80 cm and it will not be harmed. However, the snake
reasons that relative to it the cleavers are moving at $\beta = 0.6$ and thus are only 80 cm apart when they strike
the 100 cm long snake, and thus it will be severed. Use the Lorentz transformation to resolve this paradox.

2.  Explain  what  is  meant  by  the  following  statement:  “Lorentz  transformations  are  orthogonal  transformations 
in  Minkowski  space.” 

3.  Which  of  the  following  are  invariant  quantities  in  space-time? 

(a)  Energy 

(b)  Momentum 

(c)  Mass 

(d)  Force 

(e)  Charge 

(f)  The  length  of  a vector 

(g)  The  length  of  a four-vector 

4.  What  does  it  mean  for  two  events  to  have  a spacelike  interval?  What  does  it  mean  for  them  to  have  a timelike 
interval?  Draw  a picture  to  support  your  answer.  In  which  case  can  events  be  causally  connected? 

Problems 

1. A supply rocket flies past two markers on the Space Station that are 50 m apart in a time of 0.2 μs as measured
by an observer on the Space Station.

(a)  What  is  the  separation  of  the  two  markers  as  seen  by  the  pilot  riding  in  the  supply  rocket? 

(b)  What  is  the  elapsed  time  as  measured  by  the  pilot  in  the  supply  rocket? 

(c)  What  are  the  speeds  calculated  by  the  observer  in  the  Space  Station  and  the  pilot  of  the  supply  rocket? 

2. The Compton effect involves a photon of incident energy $E_i$ being scattered by an electron of mass $m_e$ which
initially is stationary. The photon scattered at an angle $\theta$ with respect to the incident photon has a final energy
$E_f$. Using the special theory of relativity, derive a formula that relates $E_f$ and $E_i$ to $\theta$.

3.  Pair  creation  involves  production  of  an  electron-positron  pair  by  a photon.  Show  that  such  a process  is 
impossible  unless  some  other  body,  such  as  a nucleus,  is  involved.  Suppose  that  the  nucleus  has  a mass  M 
and  the  electron  mass  me.  What  is  the  minimum  energy  that  the  photon  must  have  in  order  to  produce  an 
electron-positron  pair? 

4. A $K$ meson of rest energy 494 MeV decays into a $\mu$ meson of rest energy 106 MeV and a neutrino of zero
rest energy. Find the kinetic energies of the $\mu$ meson and the neutrino into which the $K$ meson decays while
at rest.


Chapter  17 


The  transition  to  quantum  physics 


17.1  Introduction 

Classical mechanics, including extensions to relativistic velocities, embraces an unusually broad range of topics
ranging from astrophysics to nuclear and particle physics, from one-body to many-body statistical mechanics.
It is interesting to discuss the role of classical mechanics in the development of quantum mechanics, which
plays a crucial role in physics. A valid question is "why discuss quantum mechanics in a classical mechanics
course?". The answer is that quantum mechanics supersedes classical mechanics as the fundamental theory
of mechanics. Classical mechanics is an approximation applicable for situations where quantization is
unimportant. Thus there must be a correspondence principle that relates quantum mechanics to classical
mechanics, analogous to the relation between relativistic and non-relativistic mechanics. It is illuminating to
study the role played by the Hamiltonian formulation of classical mechanics in the development of quantal
theory and statistical mechanics. The Hamiltonian formulation is expressed in terms of the phase-space
variables $\mathbf{q}, \mathbf{p}$ for which there are well-established rules for transforming to quantal linear operators.


17.2  Brief  summary  of  the  origins  of  quantum  theory 

The  last  decade  of  the  19th  century  saw  the  culmination  of  classical  physics.  By  1900  scientists  thought 
that  the  basic  laws  of  mechanics,  electromagnetism,  and  statistical  mechanics  were  understood  and  worried 
that  future  physics  would  be  reduced  to  confirming  theories  to  the  fifth  decimal  place,  with  few  major  new 
discoveries to be made. However, technical developments such as photography, vacuum pumps, induction
coils, etc., led to important discoveries that revolutionized physics and toppled classical mechanics from its
throne  at  the  beginning  of  the  20th  century.  Table  17.1  summarizes  some  of  the  major  milestones  leading 
up  to  the  development  of  quantum  mechanics. 

Max Planck searched for an explanation of the spectral shape of the black-body electromagnetic radiation.
He found an interpolation between two conflicting theories, one that reproduced the short wavelength
behavior, and the other the long wavelength behavior. Planck's interpolation required assuming that electromagnetic
radiation was not emitted with a continuous range of energies, but that electromagnetic radiation
is emitted in discrete bundles of energy called quanta. In December 1900 he presented his theory which
reproduced precisely the measured black-body spectral distribution by assuming that the energy carried by
a single quantum must be an integer multiple of $h\nu$:

$$E = h\nu = \frac{hc}{\lambda} \tag{17.1}$$

where $\nu$ is the frequency and $\lambda$ the wavelength of the electromagnetic radiation, and Planck's constant, $h = 6.626 \times 10^{-34}$ J·s, was
the best-fit parameter of the interpolation. That is, Planck assumed that energy comes in discrete bundles
equal to $h\nu$, which are called quanta. By making this extreme assumption, in an act of desperation,
Planck was able to reproduce the experimental black-body radiation spectrum. The assumption that energy
was exchanged in bundles hinted that the classical laws of physics were inadequate in the microscopic
domain. The older generation of physicists initially refused to believe Planck's hypothesis which underlies
quantum theory. It was the new generation of physicists, like Einstein, Bohr, Heisenberg, Born, Schrödinger,
and Dirac, who developed Planck's hypothesis leading to the revolutionary quantum theory.

In 1905, Einstein predicted the existence of the photon, derived the theory of specific heat, and
derived the Theory of Special Relativity. It is remarkable to realize that he developed these three
revolutionary theories in one year, when he was only 26 years old. Einstein uncovered an inconsistency in
Planck's derivation of the black-body spectral distribution in that it assumed the statistical part of the energy
is quantized, whereas the electromagnetic radiation assumed Maxwell's equations with oscillator energies
being continuous. Planck demanded that light of frequency $\nu$ be packaged in quanta whose energies were
multiples of $h\nu$, but Planck never thought that light would have particle-like behavior. Newton believed that
light involved corpuscles, and Hamilton developed the Hamilton-Jacobi theory seeking to describe light in
terms of the corpuscle theory. However, Maxwell had convinced physicists that light was a wave phenomenon;
interference plus diffraction effects were convincing manifestations of the wave-like properties of light. In
order to reproduce Planck's prediction, Einstein had to treat black-body radiation as if it consisted of a gas
of photons, each photon having energy $E = h\nu$. This was a revolutionary concept that returned to Newton's
corpuscle theory of light. Einstein realized that there were direct tests of his photon hypothesis, one of which
is the photo-electric effect. According to Einstein, each photon has an energy $E = h\nu$, so the energy of the
photoelectron depends on the frequency of the light, in contrast to the classical case where the energy of the
photoelectron depends on the intensity of the light. Einstein predicted that the ejected electron will have a
kinetic energy

$$KE = h\nu - W \tag{17.2}$$

where  W is  the  work  function  which  is  the  energy  needed  to  remove  an  electron  from  a solid. 
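
As a numerical illustration of equation 17.2, the following Python sketch evaluates the photoelectron kinetic energy for two wavelengths. The work function of 2.3 eV, the two wavelengths, and the function name `photoelectron_ke_ev` are assumptions chosen purely for illustration; they are not taken from the text.

```python
# Photoelectric effect, equation 17.2: KE = h*nu - W, with nu = c / wavelength.
H_PLANCK = 6.62607015e-34   # Planck's constant, J s
C_LIGHT = 2.99792458e8      # speed of light, m/s
EV = 1.602176634e-19        # joules per electron volt


def photoelectron_ke_ev(wavelength_m, work_function_ev):
    """Kinetic energy (eV) of the ejected electron; 0.0 if the photon energy is below W."""
    photon_energy_ev = H_PLANCK * C_LIGHT / wavelength_m / EV
    return max(photon_energy_ev - work_function_ev, 0.0)


# Hypothetical illustration: W = 2.3 eV is an assumed work function.
W_ASSUMED = 2.3
ke_violet = photoelectron_ke_ev(400e-9, W_ASSUMED)  # 400 nm light
ke_red = photoelectron_ke_ev(700e-9, W_ASSUMED)     # 700 nm light
print(f"400 nm: KE = {ke_violet:.2f} eV")
print(f"700 nm: KE = {ke_red:.2f} eV")
```

In this sketch the 700 nm photon energy is below the assumed work function, so no electron is ejected however intense the light, whereas the classical picture would predict ejection at sufficient intensity.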

Many  older  scientists,  including  Planck,  accepted  Einstein’s  theory  of  relativity  but  were  skeptical  of 
the  photon  concept,  even  after  Einstein’s  theory  was  vindicated  in  1915  by  Millikan  who  showed  that,  as 
predicted,  the  energy  of  the  ejected  photoelectron  depended  on  the  frequency,  and  not  intensity,  of  the  light. 
In 1923 Compton demonstrated that electromagnetic radiation scattered by free electrons obeyed simple
two-body scattering laws, which finally convinced the many skeptics of the existence of the photon.


Table 17.1: Chronology of the development of quantum mechanics

Date  Author                 Development
1887  Hertz                  Discovered the photo-electric effect
1895  Röntgen                Discovered x-rays
1896  Becquerel              Discovered radioactivity
1897  J.J. Thomson           Discovered the first fundamental particle, the electron
1898  Pierre & Marie Curie   Showed that thorium is radioactive which founded nuclear physics
1900  Planck                 Quantization $E = h\nu$ explained the black-body spectrum
1905  Einstein               Theory of special relativity
1905  Einstein               Predicted the existence of the photon
1906  Einstein               Used Planck's constant to explain specific heats of solids
1909  Millikan               The oil drop experiment measured the charge on the electron
1911  Rutherford             Discovered the atomic nucleus with radius $10^{-15}$ m
1912  Bohr                   Bohr model of the atom explained the quantized states of hydrogen
1914  Moseley                X-ray spectra determined the atomic number of the elements
1915  Millikan               Used the photo-electric effect to confirm the photon hypothesis
1915  Wilson-Sommerfeld      Proposed quantization of the action-angle integral
1921  Stern-Gerlach          Observed space quantization in non-uniform magnetic field
1923  Compton                Compton scattering of x-rays confirmed the photon hypothesis
1924  de Broglie             Postulated wave-particle duality for matter and EM waves
1924  Bohr                   Explicit statement of the correspondence principle
1925  Pauli                  Postulated the exclusion principle
1925  Goudsmit-Uhlenbeck     Postulated the spin of the electron of $s = \frac{1}{2}\hbar$
1925  Heisenberg             Matrix mechanics representation of quantum theory
1925  Dirac                  Related Poisson brackets and commutation relations
1926  Schrödinger            Wave mechanics
1927  G.P. Thomson/Davisson  Electron diffraction proved wave nature of electron
1928  Dirac                  Developed the Dirac relativistic wave equation

17.2.1 Bohr model of the atom

The Rutherford scattering experiment, performed at Manchester in 1911, discovered that the Au atom
comprised a positively charged nucleus of radius $\approx 10^{-14}$ m which is much smaller than the $1.35 \times 10^{-10}$ m
radius  of  the  Au  atom.  Stimulated  by  this  discovery,  Niels  Bohr  joined  Rutherford  at  Manchester  in  1912 
where  he  developed  the  Bohr  model  of  the  atom.  This  theory  was  remarkably  successful  in  spite  of  having 
serious  inconsistencies  and  deficiencies.  Bohr’s  model  assumptions  were: 

1) Electromagnetic radiation is quantized with $E = h\nu$.

2) Electromagnetic radiation exhibits behavior characteristic of the emission of photons with energy
$E = h\nu$ and momentum $p = \frac{h}{\lambda}$. That is, it exhibits both wave-like and particle-like behavior.

3) Electrons are in stationary orbits that do not radiate, which contradicts the predictions of classical
electromagnetism.

4) The orbits are quantized such that the electron angular momentum is an integer multiple of $\frac{h}{2\pi} = \hbar$.

5) Atomic electromagnetic radiation is emitted with photon energy equal to the difference in binding
energy between the two atomic levels involved, $h\nu = E_1 - E_2$.

The  first  two  assumptions  are  due  to  Planck  and  Einstein,  while  the  last  three  were  made  by  Niels  Bohr. 
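
Assumptions 3 to 5 can be illustrated numerically. Combining assumption 4 with a classical circular orbit in the Coulomb field gives the standard Bohr energy levels $E_n = -m_e e^4 / (8 \epsilon_0^2 h^2 n^2)$. This formula is not written out in the text above, so it is quoted here as an assumption, as are the function names and the neglect of the finite nuclear mass. The Python sketch below uses it together with assumption 5 to estimate the wavelength emitted in the $n = 3 \to 2$ transition of hydrogen.

```python
# Bohr-model energy levels of hydrogen and the photon emitted in a transition.
# Assumes the standard Bohr result E_n = -m e^4 / (8 eps0^2 h^2 n^2) with an
# infinitely heavy nucleus; this formula is not stated in the text itself.
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
H_PLANCK = 6.62607015e-34   # Planck's constant, J s
C_LIGHT = 2.99792458e8      # speed of light, m/s


def bohr_energy_ev(n):
    """Energy (eV) of the n-th quantized orbit; negative because the state is bound."""
    energy_joule = -M_E * E_CHARGE**4 / (8 * EPS0**2 * H_PLANCK**2 * n**2)
    return energy_joule / E_CHARGE


def emitted_wavelength_nm(n_upper, n_lower):
    """Wavelength (nm) of the photon with h*nu equal to the level difference."""
    photon_energy_joule = (bohr_energy_ev(n_upper) - bohr_energy_ev(n_lower)) * E_CHARGE
    return H_PLANCK * C_LIGHT / photon_energy_joule * 1e9


ground_state_ev = bohr_energy_ev(1)
balmer_alpha_nm = emitted_wavelength_nm(3, 2)
print(f"E_1 = {ground_state_ev:.2f} eV")
print(f"n = 3 -> 2 wavelength = {balmer_alpha_nm:.1f} nm")
```

The sketch gives a ground-state energy close to -13.6 eV and a red line near 656 nm, which is the kind of agreement with the hydrogen spectrum that made the model so successful despite its inconsistencies.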

The  deficiencies  of  the  Bohr  model  were  the  philosophical  problems  of  violating  the  tenets  of  classical 
physics  in  explaining  hydrogen-like  atoms,  that  is,  the  theory  was  prescriptive,  not  deductive.  The  Bohr 
model  was  based  implicitly  on  the  assumption  that  quantum  theory  contains  classical  mechanics  as  a limiting 
case.  Bohr  explicitly  stated  this  assumption  which  he  called  the  correspondence  principle,  and  which 
played a pivotal role in the development of the older quantum theory. In 1924 Bohr justified the inconsistencies
of the old quantum theory by writing "As frequently emphasized, these principles, although they
are formulated by the help of classical conceptions, are to be regarded purely as laws of quantum theory,
which give us, notwithstanding the formal nature of quantum theory, a hope in the future of a consistent
theory,  which  at  the  same  time  reproduces  the  characteristic  features  of  quantum  theory,  important  for  its 
applicability,  and,  nevertheless,  can  be  regarded  as  a rational  generalization  of  classical  electrodynamics." 

The  old  quantum  theory  was  remarkably  successful  in  reproducing  the  black-body  spectrum,  specific  heats 
of  solids,  the  hydrogen  atom,  and  the  periodic  table  of  the  elements.  Unfortunately,  from  a methodological 
point  of  view,  the  theory  was  a hodgepodge  of  hypotheses,  principles,  theorems,  and  computational  recipes, 
rather than a logically consistent theory. Every problem was first solved in terms of classical mechanics,
and  then  would  pass  through  a mysterious  quantization  procedure  involving  the  correspondence  principle. 
Although  built  on  the  foundation  of  classical  mechanics,  it  required  Bohr’s  hypotheses  which  violated  the 
laws  of  classical  mechanics  and  predictions  of  Maxwell’s  equations. 

17.2.2  Quantization 

By  1912  Planck,  and  others,  had  abandoned  the  concept  that  quantum  theory  was  a branch  of  classical 
mechanics,  and  were  searching  to  see  if  classical  mechanics  was  a special  case  of  a more  general  quantum 
physics,  or  quantum  physics  was  a science  altogether  outside  of  classical  mechanics.  Also  they  were  trying 
to  find  a consistent  and  rational  reason  for  quantization  to  replace  the  ad  hoc  assumption  of  Bohr. 

In 1912 Sommerfeld proposed that, in every elementary process, the atom gains or loses a definite amount
of action between times $t_0$ and $t$ of

$$S = \int_{t_0}^{t} L(t')\,dt' \tag{17.3}$$

where  S is  the  quantal  analogue  of  the  classical  action  function.  It  has  been  shown  that  the  classical  principle 
of  least  action  states  that  the  action  function  is  stationary  for  small  variations  of  the  trajectory.  In  1915 
Wilson  and  Sommerfeld  recognized  that  the  quantization  of  angular  momentum  could  be  expressed  in  terms 
of the action-angle integral, that is equation 14.116. They postulated that, for every coordinate, the action-angle
variable is quantized

$$\oint p_k\,dq_k = nh \tag{17.4}$$

where the action-angle variable integral is over one complete period of the motion. That is, they postulated
that Hamilton's phase space is quantized, but the microscopic granularity is such that the quantization is
only manifest for atomic-sized domains. That is, $n$ is a small integer for atomic systems in contrast to
$n \approx 10^{74}$ for the Earth-Sun two-body system.
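
The contrast between atomic and macroscopic quantum numbers can be estimated with a short Python sketch of equation 17.4 for a circular orbit, where $\oint p_\phi\,d\phi = 2\pi m v r$. The orbital data for the Earth and for the first Bohr orbit are standard reference values supplied here as assumptions, and the function name `sommerfeld_quantum_number` is hypothetical.

```python
import math

# Wilson-Sommerfeld quantization (equation 17.4) for a circular orbit:
# the loop integral of p_phi d(phi) equals 2*pi*m*v*r, so n = 2*pi*m*v*r / h.
H_PLANCK = 6.62607015e-34  # Planck's constant, J s


def sommerfeld_quantum_number(mass_kg, speed_m_s, radius_m):
    """Quantum number n implied by the action-angle integral of a circular orbit."""
    return 2 * math.pi * mass_kg * speed_m_s * radius_m / H_PLANCK


# Assumed reference values (not from the text): the Earth's orbit about the Sun.
M_EARTH = 5.972e24         # kg
R_EARTH_ORBIT = 1.496e11   # m
YEAR = 3.156e7             # s
v_earth = 2 * math.pi * R_EARTH_ORBIT / YEAR
n_earth = sommerfeld_quantum_number(M_EARTH, v_earth, R_EARTH_ORBIT)

# Assumed reference values: the electron in the first Bohr orbit of hydrogen.
M_E = 9.1093837015e-31     # kg
V_BOHR = 2.18769126e6      # m/s
R_BOHR = 5.29177211e-11    # m
n_hydrogen = sommerfeld_quantum_number(M_E, V_BOHR, R_BOHR)

print(f"Earth-Sun orbit:  n = {n_earth:.2e}")
print(f"First Bohr orbit: n = {n_hydrogen:.3f}")
```

With these inputs the hydrogen ground state corresponds to $n = 1$, while the quantum number for the Earth's orbit is of order $10^{74}$, so the granularity of phase space is far too fine to be observed at planetary scales.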

Sommerfeld recognized that quantization of more than one degree of freedom is needed to obtain a more
accurate description of the hydrogen atom. Sommerfeld reproduced the experimental data by assuming
quantization of the three degrees of freedom,

$$\oint p_r\,dr = n_1 h \qquad \oint p_\theta\,d\theta = n_2 h \qquad \oint p_\phi\,d\phi = n_3 h \tag{17.5}$$


and solving the Hamilton-Jacobi equation by separation of variables. In 1916 the Bohr-Sommerfeld model solved
the classical orbits for the hydrogen atom, including relativistic corrections as described in example 16.7.
This reproduced the fine structure observed in the optical spectra of hydrogen. The use of the canonical transformation
to action-angle variables proved to be the ideal approach for solving many such problems in
quantum mechanics. In 1921, Stern and Gerlach demonstrated space quantization by observing the splitting
of  atomic  beams  deflected  by  non-uniform  magnetic  fields.  This  result  was  a major  triumph  for  quantum 
theory.  Sommerfeld  declared  that  "With  their  bold  experimental  method,  Stern  and  Gerlach  demonstrated 
not  only  the  existence  of  space  quantization,  they  also  proved  the  atomic  nature  of  the  magnetic  moment, 
its  quantum-theoretic  origin,  and  its  relation  to  the  atomic  structure  of  electricity." 

In  1925,  Pauli’s  Exclusion  Principle  proposed  that  no  more  than  one  electron  can  have  identical  quantum 
numbers and that the atomic electronic state is specified by four quantum numbers. Two students, Goudsmit
and Uhlenbeck, suggested that a fourth two-valued quantum number was the electron spin of $\pm\frac{1}{2}$. This
provided  an  explanation  for  the  structure  of  multi-electron  atoms. 


17.2.3  Wave-particle  duality 


In his 1924 doctoral thesis, Prince Louis de Broglie proposed the hypothesis of wave-particle duality which
was a pivotal development in quantum theory. De Broglie used the classical concept of a matter wave packet,
analogous to classical wave packets discussed in chapter 3.11. He assumed that both the group and signal
velocities of a matter wave packet must equal the velocity of the corresponding particle. By analogy with
Einstein's relation for the photon, and using the Theory of Special Relativity, de Broglie assumed that

$$\hbar\omega = E = \frac{mc^2}{\sqrt{1 - \frac{v^2}{c^2}}} \tag{17.6}$$

The group velocity is required to equal the velocity of the mass $m$

$$v_{group} = \frac{d\omega}{dk} = \frac{\left(\frac{d\omega}{dv}\right)}{\left(\frac{dk}{dv}\right)} = v \tag{17.7}$$

This gives

$$\frac{dk}{dv} = \frac{1}{v}\left(\frac{d\omega}{dv}\right) = \left(\frac{m}{\hbar}\right)\left(1 - \frac{v^2}{c^2}\right)^{-3/2} \tag{17.8}$$

Integration of this equation assuming that $k = 0$ when $v = 0$, then gives

$$\hbar k = \frac{mv}{\sqrt{1 - \frac{v^2}{c^2}}} = p \tag{17.9}$$


This  relation,  derived  by  de  Broglie,  is  required  to  ensure  that  the  particle  travels  at  the  group  velocity 
of  the  wave  packet  characterizing  the  particle.  Note  that  although  the  relations  used  to  characterize  the 
matter  waves  are  purely  classical,  the  physical  content  of  such  waves  is  beyond  classical  physics.  In  1927  C. 
Davisson  and  G.P.  Thomson  independently  observed  electron  diffraction  confirming  wave/particle  duality 
for  the  electron.  Ironically,  J.J.  Thomson  discovered  that  the  electron  was  a particle,  while  his  son  attributed 
it  to  an  electron  wave. 
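
The integration step that leads from equation 17.8 to equation 17.9 can be checked numerically. The Python sketch below integrates $dk/dv$ from $v = 0$ using Simpson's rule and compares $\hbar k$ with the relativistic momentum. The choice of an electron moving at $0.6c$, the step count, and the function names are assumptions made for illustration only.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C_LIGHT = 2.99792458e8   # speed of light, m/s
M_E = 9.1093837015e-31   # electron mass, kg (illustrative choice of particle)


def dk_dv(v, mass):
    """Right-hand side of equation 17.8: (m / hbar) * (1 - v^2/c^2)^(-3/2)."""
    return (mass / HBAR) * (1.0 - v * v / C_LIGHT**2) ** -1.5


def wavenumber(v, mass, steps=10000):
    """Integrate dk/dv from 0 to v with Simpson's rule, taking k = 0 at v = 0.

    The number of steps must be even for Simpson's rule.
    """
    h = v / steps
    total = dk_dv(0.0, mass) + dk_dv(v, mass)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * dk_dv(i * h, mass)
    return total * h / 3.0


v = 0.6 * C_LIGHT
k_numeric = wavenumber(v, M_E)
p_relativistic = M_E * v / math.sqrt(1.0 - v * v / C_LIGHT**2)
relative_difference = abs(HBAR * k_numeric - p_relativistic) / p_relativistic
print(f"hbar*k = {HBAR * k_numeric:.6e} kg m/s")
print(f"p      = {p_relativistic:.6e} kg m/s")
```

The two printed values agree to within the numerical accuracy of the quadrature, consistent with the de Broglie relation $p = \hbar k$.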

Heisenberg  developed  the  modern  matrix  formulation  of  quantum  theory  in  1925;  he  was  24  years  old 
at the time. A few months later Schrödinger developed wave mechanics based on de Broglie's concept of
wave-particle duality. The matrix-mechanics and wave-mechanics formulations of quantum theory are radically different.
Heisenberg's algebraic approach employs non-commuting quantities and unfamiliar mathematical techniques
that emphasize the discreteness characteristic of the corpuscle aspect. In contrast, Schrödinger used the
familiar analytical approach that is an extension of classical laws of motion and waves which stresses the
element of continuity.

17.3 Hamiltonian in quantum theory

17.3.1  Heisenberg’s  matrix-mechanics  representation 

The algebraic Heisenberg representation of quantum theory is analogous to the algebraic Hamiltonian representation
of classical mechanics, and shows best how quantum theory evolved from, and is related to,
classical mechanics. Heisenberg decided to ignore the prevailing conceptual theories, such as classical mechanics,
and based his quantum theory on observables. This approach was influenced by the success of
Bohr's older quantum theory and Einstein's theory of relativity. First, he abandoned the classical notion that
the canonical variables $p_k, q_k$ can be measured directly and simultaneously. Secondly, he wished to absorb the
correspondence principle directly into the theory instead of it being an ad hoc procedure tailored to each application.
Heisenberg considered the Fourier decomposition of transition amplitudes between discrete states
and found that the products of the conjugate variables do not commute. Heisenberg derived, for the first
time, the correct energy levels of the one-dimensional harmonic oscillator as $E_n = \hbar\omega(n + \frac{1}{2})$, which was a
significant achievement. Born recognized that Heisenberg's strange multiplication and commutation rules for
two variables corresponded to matrix algebra. Prior to 1925, matrix algebra was an obscure branch of pure
mathematics not known or used by the physics community. Heisenberg, Born, and the young mathematician
Jordan developed the commutation rules of matrix mechanics. Heisenberg's approach represents the
classical position and momentum coordinates $q, p$ by matrices $\mathbf{q}$ and $\mathbf{p}$, with corresponding matrix elements
$q_{mn} e^{i\omega_{mn} t}$ and $p_{mn} e^{i\omega_{mn} t}$. Born showed that the trace of the matrix

$$H(\mathbf{p}, \mathbf{q}) = \mathbf{p}\dot{\mathbf{q}} - L \tag{17.10}$$

gives the Hamiltonian function $H(\mathbf{p}, \mathbf{q})$ of the matrices $\mathbf{q}$ and $\mathbf{p}$ which leads to Hamilton's canonical equations

$$\dot{q} = \frac{\partial H}{\partial p} \qquad \dot{p} = -\frac{\partial H}{\partial q} \tag{17.11}$$

Heisenberg and Born also showed that the commutator of $q, p$ equals

$$\begin{aligned} q_k p_l - p_l q_k &= i\hbar\delta_{kl} \\ q_k q_l - q_l q_k &= 0 \\ p_k p_l - p_l p_k &= 0 \end{aligned} \tag{17.12}$$

Born realized that equation (17.12) is the only fundamental equation for introducing $\hbar$ into the theory in a
logical and consistent way.

Chapter 14.2.4 discussed the formal correspondence between the Poisson bracket, defined in chapter 14.3,
and the commutator in classical mechanics. It was shown that the commutator of two functions equals a
constant multiplicative factor $\lambda$ times the corresponding Poisson Bracket. That is

$$(F_j G_k - G_k F_j) = \lambda [F_j, G_k] \tag{17.13}$$

where the multiplicative factor $\lambda$ is a number independent of $F_j$, $G_k$, and the commutator.

In 1925, Paul Dirac, a 23-year old graduate student at Cambridge, recognized the crucial importance of
the above correspondence between the commutator and the Poisson Bracket of two functions for relating
classical mechanics and quantum mechanics. Dirac noted that if the constant $\lambda$ is assigned the value $\lambda = i\hbar$,
then equation 17.13 directly relates Heisenberg's commutation relations between the fundamental canonical
variables $(q_j, p_k)$ to the corresponding classical Poisson Bracket $[q_j, p_k]$. That is,

$$q_k p_l - p_l q_k = i\hbar[q_k, p_l] = i\hbar\delta_{kl} \tag{17.14}$$

$$q_k q_l - q_l q_k = i\hbar[q_k, q_l] = 0 \tag{17.15}$$

$$p_k p_l - p_l p_k = i\hbar[p_k, p_l] = 0 \tag{17.16}$$

Dirac recognized that the correspondence between the classical Poisson bracket and the quantum commutator
in equation (17.13) provides a logical and consistent way that builds quantization directly into the
theory, rather than using an ad hoc, case-dependent hypothesis as used by the older quantum theory of
Bohr. The basis of Dirac's quantization principle involves replacing the classical Poisson Bracket $[F_j, G_k]$
by the commutator $\frac{1}{i\hbar}(F_j G_k - G_k F_j)$. That is,

$$[F_j, G_k] \Rightarrow \frac{1}{i\hbar}(F_j G_k - G_k F_j) \tag{17.17}$$

Hamilton's canonical equations, as introduced in chapter 14, are only applicable to classical mechanics
since they assume that the position and conjugate momentum can both be specified exactly and
simultaneously, which contradicts Heisenberg's Uncertainty Principle. In contrast, the Poisson bracket
generalization  of  Hamilton’s  equations  allows  for  non-commuting  variables  plus  the  corresponding  uncertainty 
principle.  That  is,  the  transformation  from  classical  mechanics  to  quantum  mechanics  can  be  accomplished 
simply  by  replacing  the  classical  Poisson  Bracket  by  the  quantum  commutator,  as  proposed  by  Dirac.  The 
formal  analogy  between  classical  Hamiltonian  mechanics,  and  the  Heisenberg  representation  of  quantum 
mechanics  is  strikingly  apparent  using  the  correspondence  between  the  Poisson  Bracket  representation  of 
Hamiltonian  mechanics  and  Heisenberg’s  matrix  mechanics. 

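The direct relation between the quantum commutator, and the corresponding classical Poisson Bracket,
can be applied to many observables. For example, the quantum analogs of Hamilton's equations of motion
are given by use of Hamilton's equations of motion, 14.53, 14.56, and replacing each Poisson Bracket by the
corresponding commutator. That is

$$\frac{dq_k}{dt} = \frac{\partial H}{\partial p_k} = [q_k, H] = \frac{1}{i\hbar}(q_k H - H q_k) \tag{17.18}$$

$$\frac{dp_k}{dt} = -\frac{\partial H}{\partial q_k} = [p_k, H] = \frac{1}{i\hbar}(p_k H - H p_k) \tag{17.19}$$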


Chapter 14.2.5 discussed the time dependence of observables in Hamiltonian mechanics. Equation 14.45
gave the total time derivative of any observable $G$ to be

$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + [G, H] \tag{17.20}$$

Equation 17.17 can be used to replace the Poisson Bracket by the quantum commutator, which gives the
corresponding time dependence of observables in quantum physics.

$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + \frac{1}{i\hbar}(GH - HG) \tag{17.21}$$

In quantum mechanics, equation 17.21 is called the Heisenberg equation. Note that if the observable $G$ is
chosen to be a fundamental canonical variable, then $\frac{\partial q_k}{\partial t} = 0 = \frac{\partial p_k}{\partial t}$ and equation 17.21 reduces to Hamilton's
equations 17.18 and 17.19.

The analogies between classical mechanics and quantum mechanics extend further. For example, if $G$ is
a constant of motion, that is $\frac{dG}{dt} = 0$, then Heisenberg's equation of motion gives

$$\frac{\partial G}{\partial t} + \frac{1}{i\hbar}(GH - HG) = 0 \tag{17.22}$$

Moreover, if $G$ is not an explicit function of time, then

$$0 = \frac{1}{i\hbar}(GH - HG) \tag{17.23}$$

That  is,  the  transition  to  quantum  physics  shows  that,  if  G is  a constant  of  motion,  and  is  not  explicitly 
time  dependent,  then  G commutes  with  the  Hamiltonian  H. 

The  above  discussion  has  illustrated  the  close  and  beautiful  correspondence  between  the  Poisson  Bracket 
representation  of  classical  Hamiltonian  mechanics,  and  the  Heisenberg  representation  of  quantum  mechanics. 
Dirac  provided  the  elegant  and  simple  correspondence  principle  connecting  the  Poisson  bracket  representation 
of  classical  Hamiltonian  mechanics,  to  the  Heisenberg  representation  of  quantum  mechanics. 
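
The commutation relation of equation 17.12 and the oscillator levels $E_n = \hbar\omega(n + \frac{1}{2})$ quoted earlier can be illustrated with explicit matrices. The Python sketch below builds truncated harmonic-oscillator matrices for $q$ and $p$ from the standard raising and lowering matrices. That construction, the units $\hbar = m = \omega = 1$, the truncation size, and the helper names are all assumptions introduced for illustration; the exact matrices are infinite, so the last diagonal element of each result is a truncation artifact.

```python
import math

HBAR = 1.0   # natural units, assumed for readability
MASS = 1.0
OMEGA = 1.0
N = 6        # truncation size; the exact matrices are infinite-dimensional


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def combine(ca, a, cb, b):
    """Return the matrix ca*a + cb*b."""
    return [[ca * x + cb * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


# Lowering matrix with elements lower[n-1][n] = sqrt(n); raising is its transpose.
lower = [[math.sqrt(j) if j == i + 1 else 0.0 for j in range(N)] for i in range(N)]
upper = [[lower[j][i] for j in range(N)] for i in range(N)]

q_scale = math.sqrt(HBAR / (2 * MASS * OMEGA))
p_scale = math.sqrt(MASS * HBAR * OMEGA / 2)
q = combine(q_scale, lower, q_scale, upper)             # q = q_scale * (a + a^dagger)
p = combine(-1j * p_scale, lower, 1j * p_scale, upper)  # p = i * p_scale * (a^dagger - a)

# Commutator qp - pq, to be compared with i*hbar times the unit matrix.
commutator = combine(1, matmul(q, p), -1, matmul(p, q))

# Hamiltonian H = p^2 / 2m + m * omega^2 * q^2 / 2; its diagonal holds the energies.
hamiltonian = combine(1 / (2 * MASS), matmul(p, p), MASS * OMEGA**2 / 2, matmul(q, q))
energies = [hamiltonian[n][n].real for n in range(N - 1)]

print("diagonal of qp - pq:", [commutator[n][n] for n in range(N)])
print("energies:", energies)
```

All diagonal elements of $qp - pq$ except the last equal $i\hbar$, and the energies come out as $\hbar\omega(n + \frac{1}{2})$. The anomalous final element is produced by cutting the infinite matrices off at finite size, not by the physics.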




17.3.2 Schrödinger's wave-mechanics representation

The wave mechanics formulation of quantum mechanics, by the Austrian theorist Schrödinger, was built on
the wave-particle duality concept that was proposed in 1924 by Louis de Broglie. Schrödinger developed
his wave mechanics representation of quantum physics a year after the development of matrix mechanics
by Heisenberg and Born. The Schrödinger wave equation is based on the non-relativistic Hamilton-Jacobi
representation of a wave equation, melded with the operator formalism of Born and Wiener. The 39-year old
Schrödinger was an expert in classical mechanics and wave theory, which was invaluable when he developed
the important Schrödinger equation. As mentioned in chapter 14.4.4, the Hamilton-Jacobi theory is a
formalism of classical mechanics that allows the motion of a particle to be represented by a wave. That is,
the wavefronts are surfaces of constant action $S$, and the particle momenta are normal to these constant-action
surfaces, that is, $\mathbf{p} = \nabla S$. The wave-particle duality of Hamilton-Jacobi theory is a natural way to
handle the wave-particle duality proposed by de Broglie.

Consider the classical Hamilton-Jacobi equation for one body, given by 13.20.

$$\frac{\partial S}{\partial t} + H(\mathbf{q}, \nabla S, t) = 0 \tag{17.24}$$

If the Hamiltonian is time independent, then equation 14.91 gives that

$$\frac{\partial S}{\partial t} = -H(\mathbf{q}, \mathbf{p}, t) = -E(\alpha) \tag{17.25}$$

The integration of the time dependence is trivial, and thus the action integral for a time-independent Hamiltonian
is

$$S(\mathbf{q}, \alpha, t) = W(\mathbf{q}, \alpha) - E(\alpha)t \tag{17.26}$$

A formal transformation gives

$$\mathbf{p} = \nabla S \tag{17.27}$$


Consider that the classical time-independent Hamiltonian, for motion of a single particle, is represented
by the Hamilton-Jacobi equation.

$$H = \frac{p^2}{2m} + U(q) = -\frac{\partial S}{\partial t} \tag{17.28}$$

Substituting for $\mathbf{p}$ leads to the classical Hamilton-Jacobi relation in terms of the action $S$

$$\frac{1}{2m}(\nabla S \cdot \nabla S) + U(q) = -\frac{\partial S}{\partial t} \tag{17.29}$$


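By analogy with the Hamilton-Jacobi equation, Schrödinger proposed the quantum operator equation

$$i\hbar\frac{\partial\psi}{\partial t} = \hat{H}\psi \tag{17.30}$$

where $\hat{H}$ is an operator given by

$$\hat{H} = -\frac{\hbar^2}{2m}\nabla^2 + U(r) \tag{17.31}$$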


In 1926, Max Born and Norbert Wiener introduced the operator formalism into matrix mechanics for prediction
of observables and this has become an integral part of quantum theory. In the operator formalism, the
observables are represented by operators that project the corresponding observable from the wavefunction.
That is, the quantum operator formalism for the assumed momentum and energy operators, that operate
on the wavefunction $\psi$, are

$$p_j = \frac{\hbar}{i}\frac{\partial}{\partial q_j} \qquad E = -\frac{\hbar}{i}\frac{\partial}{\partial t} \tag{17.32}$$

Formal transformations of $p$ and $E$ in the Hamiltonian (17.28) lead to the time-independent Schrödinger
equation

$$-\frac{\hbar^2}{2m}\nabla^2\psi + U(r)\psi = E\psi \tag{17.33}$$

Assume that the wavefunction is of the form

$$\psi = A e^{iS/\hbar} \tag{17.34}$$


where  the  action  S gives  the  phase  of  the  wavefront,  and  A the  amplitude  of  the  wave,  as  described  in 
chapter  14.4.4.  The  time  dependence,  that  characterizes  the  motion  of  the  wavefront,  is  contained  in  the 
time  dependence  of  S.  This  form  for  the  wavefunction  has  the  advantage  that  the  wavefunction  frequently 
factors into a product of terms, e.g. $\psi = R(r)\Theta(\theta)\Phi(\phi)$ which corresponds to a summation of the exponents
$S = W_r + W_\theta + W_\phi - Et$. This summation form is exploited by separation of the variables, as discussed in
chapter  14.4.3. 

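Inserting $\psi$ (17.34) into equation (17.30), plus using the fact that

$$\frac{\partial^2\psi}{\partial q^2} = \frac{\partial}{\partial q}\left(\frac{\partial\psi}{\partial S}\frac{\partial S}{\partial q}\right) = \frac{\partial}{\partial q}\left(\frac{i}{\hbar}\psi\frac{\partial S}{\partial q}\right) = -\frac{1}{\hbar^2}\left(\frac{\partial S}{\partial q}\right)^2\psi + \frac{i}{\hbar}\frac{\partial^2 S}{\partial q^2}\psi \tag{17.35}$$

leads to

$$-\frac{\partial S}{\partial t} = \frac{1}{2m}(\nabla S \cdot \nabla S) + U(q) - \frac{i\hbar}{2m}\nabla^2 S \tag{17.36}$$

Note that if Planck's constant $\hbar = 0$, then the imaginary term in equation (17.36) is zero, leading to 17.36
being real, and identical to the Hamilton-Jacobi result, equation 17.29. The fact that equation 17.36
equals the Hamilton-Jacobi equation in the limit $\hbar \to 0$ illustrates the close analogy between the
wave-particle duality of the classical Hamilton-Jacobi theory, and de Broglie's wave-particle duality in Schrödinger's
quantum wave-mechanics representation.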

The Schrödinger approach was rapidly adopted in 1926 and exploited extensively with tremendous success,
since it is much easier to grasp conceptually than is the algebraic approach of Heisenberg. Initially there
was much conflict between the proponents of these two contradictory approaches, but this was resolved by
Schrödinger who showed in 1926 that there is a formal mathematical identity between wave mechanics and
matrix mechanics. That is, these two quantal representations of Hamiltonian mechanics are equivalent, even
though  they  are  built  on  either  the  Poisson  bracket  representation,  or  the  Hamilton-Jacobi  representation. 
Wave  mechanics  is  based  intimately  on  the  quantization  rule  of  the  action  variable.  Heisenberg’s  Uncertainty 
Principle  is  automatically  satisfied  by  Schrodinger’s  wave  mechanics  since  the  uncertainty  principle  is  a 
feature  of  all  wave  motion,  as  described  in  chapter  3. 

In  1928  Dirac  developed  a relativistic  wave  equation  which  includes  spin  as  an  integral  part.  This  Dirac 
equation  remains  the  fundamental  wave  equation  of  quantum  mechanics.  Unfortunately  it  is  difficult  to 
apply. 

Today  the  powerful  and  efficient  Heisenberg  representation  is  the  dominant  approach  used  in  the  field  of 
physics,  whereas  chemists  tend  to  prefer  the  more  intuitive  Schrodinger  wave  mechanics  approach.  In  either 
case,  the  important  role  of  Hamiltonian  mechanics  in  quantum  theory  is  undeniable. 


17.4  Lagrangian  representation  in  quantum  theory 

The classical notion of canonical coordinates and momenta has a simple quantum analog which has allowed
the Hamiltonian theory of classical mechanics, that is based on canonical coordinates, to serve as the
foundation for the development of quantum mechanics. The alternative Lagrangian formulation for classical
dynamics is described in terms of coordinates and velocities, instead of coordinates and momenta. The Lagrangian
and Hamiltonian formulations are closely related, and it may appear that the Lagrangian approach
is more fundamental. The Lagrangian method allows collecting together all the equations of motion and
expressing them as stationary properties of the action integral, and thus it may appear desirable to base
quantum mechanics on the Lagrangian theory of classical mechanics. Unfortunately, the Lagrangian equations
of motion involve partial derivatives with respect to coordinates, and their velocities, and the meaning
ascribed to such derivatives is difficult in quantum mechanics. The close correspondence between Poisson
brackets and the commutation rules leads naturally to Hamiltonian mechanics. However, Dirac showed that
Lagrangian mechanics can be carried over to quantum mechanics using canonical transformations such that
the classical Lagrangian is considered to be a function of coordinates at time $t$ and $t + dt$ rather than of
coordinates and velocities.


The motivation for Feynman's 1942 Ph.D. thesis, entitled "The Principle of Least Action in Quantum
Mechanics", was to quantize the classical action at a distance in electrodynamics. This theory adopted an
overall  space-time  viewpoint  for  which  the  classical  Hamiltonian  approach,  as  used  in  conventional  formu- 
lations of  quantum  mechanics,  is  inapplicable.  Feynman  used  the  Lagrangian,  plus  the  principle  of  least 
action,  to  underlie  his  development  of  quantum  field  theory.  To  paraphrase  Feynman’s  Nobel  Lecture,  he 
used  a physical  approach  that  is  quite  different  from  the  customary  Hamiltonian  point  of  view  for  which  the 
system  is  discussed  in  great  detail  as  a function  of  time.  That  is,  you  have  the  field  at  this  moment,  then  a 
differential  equation  gives  you  the  field  at  a later  moment  and  so  on;  that  is,  the  Hamiltonian  approach  is  a 
time  differential  method.  In  Feynman’s  least-action  approach  the  action  describes  the  character  of  the  path 
throughout  all  of  space  and  time.  The  behavior  of  nature  is  determined  by  saying  that  the  whole  space-time 
path  has  a certain  character.  The  use  of  action  involves  both  advanced  and  retarded  terms  that  make  it 
difficult  to  transform  back  to  the  Hamiltonian  form.  The  Feynman  space-time  approach  is  far  beyond  the 
scope  of  this  course.  This  topic  will  be  developed  in  advanced  graduate  courses  on  quantum  field  theory. 

17.5  Correspondence  Principle 

The  Correspondence  Principle  implies  that  any  new  theory  in  physics  must  reduce  to  preceding  theories 
that  have  been  proven  to  be  valid.  For  example,  Einstein’s  Special  Theory  of  Relativity  satisfies  the  Corre- 
spondence Principle  since  it  reduces  to  classical  mechanics  for  velocities  small  compared  with  the  velocity 
of  light.  Similarly,  the  General  Theory  of  Relativity  reduces  to  Newton’s  Law  of  Gravitation  in  the  limit 
of  weak  gravitational  fields.  Bohr’s  Correspondence  Principle  requires  that  the  predictions  of  quantum  me- 
chanics must  reproduce  the  predictions  of  classical  physics  in  the  limit  of  large  quantum  numbers.  Bohr’s 
Correspondence Principle played a pivotal role in the development of the old quantum theory, from its
inception  in  1912,  until  1925  when  the  old  quantum  theory  was  superseded  by  the  current  matrix  and  wave 
mechanics  representations  of  quantum  mechanics. 

Quantum  theory  now  is  a well-established  field  of  physics  that  is  equally  as  fundamental  as  is  classical 
mechanics.  The  Correspondence  Principle  now  is  used  to  project  out  the  analogous  classical-mechanics 
phenomena  that  underlie  the  observed  properties  of  quantal  systems.  For  example,  this  book  has  studied 
the  classical-mechanics  analogs  of  the  observed  behavior  for  typical  quantal  systems,  such  as  the  vibrational 
and  rotational  modes  of  the  molecule,  and  the  vibrational  modes  of  the  crystalline  lattice.  The  nucleus  is  the 
epitome  of  a many-body,  strongly-interacting,  quantal  system.  Example  12.12  showed  that  there  is  a close 
correspondence  between  classical-mechanics  predictions,  and  quantal  predictions,  for  both  the  rotational  and 
vibrational  collective  modes  of  the  nucleus,  as  well  as  for  the  single-particle  motion  of  the  nucleons  in  the 
nuclear mean field, such as the onset of Coriolis-induced alignment. This use of the Correspondence Principle
can provide considerable insight into the underlying classical physics embedded in quantal systems.

17.6  Summary

The  important  point  of  this  discussion  is  that  variational  formulations  of  classical  mechanics  provide  a 
rational,  and  direct  basis,  for  the  development  of  quantum  mechanics.  It  has  been  shown  that  the  final  form 
of  quantum  mechanics  is  closely  related  to  the  Hamiltonian  formulation  of  classical  mechanics.  Quantum 
mechanics  supersedes  classical  mechanics  as  the  fundamental  theory  of  mechanics  in  that  classical  mechanics 
only  applies  for  situations  where  quantization  is  unimportant,  and  is  the  limiting  case  of  quantum  mechanics 
when $h \rightarrow 0$, which is in agreement with Bohr's Correspondence Principle. The Dirac relativistic theory
of  quantum  mechanics  is  the  ultimate  quantal  theory  for  the  relativistic  regime. 

This  discussion  has  barely  scratched  the  surface  of  the  correspondence  between  classical  and  quantal 
mechanics,  which  goes  far  beyond  the  scope  of  this  course.  The  goal  of  this  chapter  is  to  illustrate  that 
classical  mechanics,  in  particular,  Hamiltonian  mechanics,  underlies  much  of  what  you  will  learn  in  your 
quantum  physics  courses.  An  interesting  similarity  between  quantum  mechanics  and  classical  mechanics  is 
that  physicists  usually  use  the  more  visual  Schrodinger  wave  representation  in  order  to  describe  quantum 
physics  to  the  non-expert,  which  is  analogous  to  the  similar  use  of  Newtonian  physics  in  classical  mechan- 
ics. However,  practicing  physicists  invariably  use  the  more  abstract  Heisenberg  matrix  mechanics  to  solve 
problems  in  quantum  mechanics,  analogous  to  widespread  use  of  the  variational  approach  in  classical  me- 
chanics, because  the  analytical  approaches  are  more  powerful  and  have  fundamental  advantages.  Quantal 
problems  in  molecular,  atomic,  nuclear,  and  subnuclear  systems,  usually  involve  finding  the  normal  modes 
of  a quantal  system,  that  is,  finding  the  eigen-energies,  eigen-functions,  spin,  parity,  and  other  observables 
for  the  discrete  quantized  levels.  Solving  the  equations  of  motion  for  the  modes  of  quantal  systems  is  sim- 
ilar to  solving  the  many-body  coupled-oscillator  problem  in  classical  mechanics,  where  it  was  shown  that 
use  of  matrix  mechanics  is  the  most  powerful  representation.  It  is  ironic  that  the  introduction  of  matrix 
methods  to  classical  mechanics  is  a by-product  of  the  development  of  matrix  mechanics  by  Heisenberg,  Born 
and  Jordan.  This  illustrates  that  classical  mechanics  not  only  played  a pivotal  role  in  the  development  of 
quantum  mechanics,  but  it  also  has  benefitted  considerably  from  the  development  of  quantum  mechanics; 
that  is,  the  synergistic  relation  between  these  two  complementary  branches  of  physics  has  been  beneficial  to 
both  classical  and  quantum  mechanics. 

Recommended  reading

"Quantum Mechanics" by P.A.M. Dirac, Oxford Press, 1947.

"Conceptual Development of Quantum Mechanics" by Max Jammer, McGraw-Hill, 1966.


Chapter  18

This book has introduced powerful analytical methods based on variational principles that play a pivotal
role  in  classical  dynamics,  as  well  as  in  many  modern  branches  of  science  and  engineering.  The  prologue 
showed  a road  map  of  the  pathways  in  advanced  classical  mechanics  that  have  been  explored  in  order  to 
introduce  the  reader  to  sophisticated  and  powerful  new  approaches  to  problem  solving  in  science.  In  spite  of 
the  considerable  amount  of  material  covered,  there  are  major  topics  that  had  to  be  omitted,  or  mentioned 
superficially. 

This  long  and  arduous  study  of  classical  mechanics  has  elucidated  the  remarkable  developments,  plus 
their  philosophical  implications,  implied  by  use  of  variational  formulations  in  classical  mechanics.  This 
approach  was  pioneered  by  Leibniz,  Lagrange,  Euler,  Hamilton  and  Jacobi  during  the  remarkable  Age  of 
Enlightenment,  and  finally  reached  full  fruition  at  the  start  of  the  20th  century.  Philosophically,  Newtonian 
mechanics  is  straightforward  in  that  it  uses  differential  equations  of  motion  that  relate  the  instantaneous 
forces  with  the  instantaneous  accelerations,  while  the  concepts  of  momentum  and  force  are  intuitive  to 
visualize  and  both  cause  and  effect  are  embedded  in  Newtonian  mechanics.  However,  Newtonian  mechanics  is 
incompatible  with  the  relativistic  concept  of  space-time,  it  is  unable  to  correctly  predict  relativistic  mechanics, 
and  it  fails  to  provide  the  unified  description  of  the  gravitational  force  plus  planetary  motion  as  geodesic 
motion  in  a four-dimensional  Riemannian  structure. 

The  philosophical  implications  embedded  in  applying  variational  principles  to  mechanics  are  remarkable. 
The  applicability  of  variational  principles  is  based  on  the  astonishing  fact  that  motion  of  a constrained 
system  in  nature  follows  a path  that  minimizes  the  action  integral.  As  a consequence,  solving  the  equations 
of  motion  is  reduced  to  finding  the  optimum  path  that  minimizes  the  action  integral.  The  fact  that  nature 
follows  optimization  principles  is  nonintuitive,  and  was  considered  to  be  metaphysical  by  many  scientists 
and philosophers, which delayed full acceptance of analytical mechanics until the development of the Theory
of  Relativity.  Variational  formulations  now  are  the  preeminent  approach  to  classical  mechanics  and  modern 
physics;  they  have  toppled  Newtonian  mechanics  from  the  throne  of  classical  mechanics  that  it  occupied  for 
two  centuries.  The  importance  of  the  variational  approach  to  science  and  engineering  justifies  the  trials  and 
tribulations  endured  learning  this  powerful  approach. 

This  book  has  gone  beyond  the  normal  syllabus  to  glimpse  how  Lagrangian  and  Hamiltonian  dynamics 
provide  the  foundation  upon  which  modern  physics  is  built.  It  has  illustrated  that  a solid  foundation  in 
analytical  mechanics  is  essential  for  the  study  of  modern  physics.  The  techniques  and  physics  discussed  in 
this  book  reappear  in  new  guises  in  many  other  courses,  but  the  basic  physics  is  unchanged.  The  fundamen- 
tal developments  and  applications  of  variational  principles  in  classical  mechanics  illustrate  the  intellectual 
beauty,  the  tremendous  philosophical  implications,  and  the  unity  of  the  field  of  physics.  The  enormous 
breadth  of  physics  addressed  by  classical  mechanics,  and  the  underlying  unity  of  the  field,  is  epitomized 
by the wide range of dimensions and complexity involved. The dimensions range from as large as $10^{27}$ m,
which is the current lower bound for the size of the universe derived from the Planck spacecraft, to quantal
analogues of classical mechanics of systems spanning in size down to the Planck length of $1.62 \times 10^{-35}$ m.
In complexity, classical mechanics spans from one body to the statistical mechanics of many-body systems.
Analytical variational methods have become the premier approach to describe systems from the very largest
to the smallest, and from one-body to many-body dynamical systems.

This book has illustrated the astonishing power of analytical variational methods for understanding the
physics underlying classical mechanics and many branches of modern physics. However, the present narrative
remains  unfinished  in  that  fundamental  philosophical  and  technical  questions  remain  to  be  solved  in  classical 
mechanics.  For  example,  analytical  mechanics  is  based  on  the  validity  of  the  assumed  principle  of  economy. 
This  book  has  not  addressed  the  philosophical  question,  "is  the  principle  of  economy  a fundamental  law  of 
nature,  or  is  it  a fortuitous  consequence  of  the  fundamental  laws  of  nature?" 


Appendix  A 


Matrix  algebra 


A.1  Mathematical  methods  for  mechanics

Development  of  classical  mechanics  has  involved  a close  and  synergistic  interweaving  of  physics  and  mathe- 
matics, that  continues  to  play  a key  role  in  these  fields.  The  concepts  of  scalar  and  vector  fields  play  a pivotal 
role  in  describing  the  force  fields  and  particle  motion  in  both  the  Newtonian  formulation  of  classical  mechan- 
ics and  electromagnetism.  Thus  it  is  imperative  that  you  be  familiar  with  the  sophisticated  mathematical 
formalism  used  to  treat  multivariate  scalar  and  vector  fields  in  classical  mechanics.  Ordinary  and  partial 
differential  equations  up  to  second  order,  as  well  as  integration  of  algebraic  and  trigonometric  functions  play 
a major  role  in  classical  mechanics.  It  is  assumed  that  you  already  have  a working  knowledge  of  differential 
and  integral  calculus  in  sufficient  depth  to  handle  this  material.  Computer  codes,  such  as  Mathematica, 
MatLab,  and  Maple,  or  symbolic  calculators,  can  be  used  to  obtain  mathematical  solutions  for  complicated 
cases. 

The  following  9 appendices  provide  brief  summaries  of  matrix  algebra,  vector  algebra,  orthogonal  co- 
ordinate systems,  coordinate  transformations,  tensor  algebra,  multivariate  calculus,  vector  differential  plus 
integral  calculus,  Fourier  analysis  and  time-sampled  waveform  analysis.  The  manipulation  of  scalar  and 
vector  fields  is  greatly  facilitated  by  transforming  to  orthogonal  curvilinear  coordinate  systems  that  match 
the  symmetries  of  the  problem.  These  appendices  discuss  the  necessity  to  account  for  the  time  dependence 
of  the  orthogonal  unit  vectors  for  curvilinear  coordinate  systems.  It  is  assumed  that,  except  for  coordinate 
transformations  and  tensor  algebra,  you  have  been  introduced  to  these  topics  in  linear  algebra  and  other 
physics  courses,  and  thus  the  purpose  of  these  appendices  is  to  serve  as  a reference  and  brief  review. 


A.2  Matrices

Matrix  algebra  provides  an  elegant  and  powerful  representation  of  multivariate  operators,  and  coordinate 
transformations  that  feature  prominently  in  classical  mechanics.  For  example  they  play  a pivotal  role  in 
finding  the  eigenvalues  and  eigenfunctions  for  coupled  equations  that  occur  in  rigid-body  rotation,  and 
coupled  oscillator  systems.  An  understanding  of  the  role  of  matrix  mechanics  in  classical  mechanics  facilitates 
understanding  of  the  equally  important  role  played  by  matrix  mechanics  in  quantal  physics. 

It  is  interesting  that  although  determinants  were  used  by  physicists  in  the  late  19th  century,  the  concept 
of  matrix  algebra  was  developed  by  Arthur  Cayley  in  England  in  1855,  but  many  of  these  ideas  were  the  work 
of  Hamilton,  and  the  discussion  of  matrix  algebra  was  buried  in  a more  general  discussion  of  determinants. 
Matrix  algebra  was  an  esoteric  branch  of  mathematics,  little  known  by  the  physics  community,  until  1925 
when  Heisenberg  proposed  his  innovative  new  quantum  theory.  The  striking  feature  of  this  new  theory 
was  its  representation  of  physical  quantities  by  sets  of  time-dependent  complex  numbers  and  a peculiar 
multiplication  rule.  Max  Born  recognized  that  Heisenberg’s  multiplication  rule  is  just  the  standard  "row 
times  column"  multiplication  rule  of  matrix  algebra;  a topic  that  he  had  encountered  as  a young  student  in  a 
mathematics  course.  In  1924  Richard  Courant  had  just  completed  the  first  volume  of  the  new  text  Methods 
of  Mathematical  Physics  during  which  Pascual  Jordan  had  served  as  his  young  assistant  working  on  matrix 
manipulation. Fortuitously, Jordan and Born happened to share a carriage on a train to Hanover during
which Jordan overheard Born talk about his problems trying to work with matrices. Jordan introduced
himself to Born and offered to help. This led to publication, in September 1925, of the famous Born-Jordan
paper [Bor25a] that gave the first rigorous formulation of matrix mechanics in physics. This was followed in
November by the Born-Heisenberg-Jordan sequel [Bor25b] that established a logically consistent general method
for  solving  matrix  mechanics  problems  plus  a connection  between  the  mathematics  of  matrix  mechanics  and 
linear  algebra.  Matrix  algebra  developed  into  an  important  tool  in  mathematics  and  physics  during  World 
War  2 and  now  it  is  an  integral  part  of  undergraduate  linear  algebra  courses. 

Most  applications  of  matrix  algebra  in  this  book  are  restricted  to  real,  symmetric,  square  matrices.  The 
size  of  a matrix  is  defined  by  the  rank,  which  equals  the  row  rank  and  column  rank,  i.e.  the  number  of 
independent  row  vectors  or  column  vectors  in  the  square  matrix.  It  is  presumed  that  you  have  studied 
matrices  in  a linear  algebra  course.  Thus  the  goal  of  this  review  is  to  list  simple  manipulation  of  symmetric 
matrices  and  matrix  diagonalization  that  will  be  used  in  this  course.  You  are  referred  to  a linear  algebra 
textbook  if  you  need  further  details. 


Matrix  definition 


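A matrix is a rectangular array of numbers with $M$ rows and $N$ columns. The notation used for an element
of a matrix is $A_{ij}$ where $i$ designates the row and $j$ designates the column of this matrix element in the
matrix $\mathbf{A}$. Convention denotes a matrix $\mathbf{A}$ as

$$
\mathbf{A} \equiv
\begin{pmatrix}
A_{11} & A_{12} & \cdots & A_{1(N-1)} & A_{1N} \\
A_{21} & A_{22} & \cdots & A_{2(N-1)} & A_{2N} \\
\vdots & \vdots & A_{ij} & \vdots & \vdots \\
A_{(M-1)1} & A_{(M-1)2} & \cdots & A_{(M-1)(N-1)} & A_{(M-1)N} \\
A_{M1} & A_{M2} & \cdots & A_{M(N-1)} & A_{MN}
\end{pmatrix}
\tag{A.1}
$$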

Matrices can be square, $M = N$, or rectangular, $M \neq N$. Matrices having only one row or column are
called row or column vectors respectively, and need only a single subscript label. For example,

$$
\mathbf{A} =
\begin{pmatrix}
A_1 \\ A_2 \\ \vdots \\ A_{M-1} \\ A_M
\end{pmatrix}
\tag{A.2}
$$


Matrix  manipulation 

Matrices  are  defined  to  obey  certain  rules  for  matrix  manipulation  as  given  below. 

1) Multiplication of a matrix by a scalar $\lambda$ simply multiplies each matrix element by $\lambda$.

$$C_{ij} = \lambda A_{ij} \tag{A.3}$$

2) Addition of two matrices $\mathbf{A}$ and $\mathbf{B}$ having the same dimensions, i.e. the same number of rows and columns, is given by

$$C_{ij} = A_{ij} + B_{ij} \tag{A.4}$$


3)  Multiplication  of  a matrix  A by  a matrix  B is  defined  only  if  the  number  of  columns  in  A equals  the 
number  of  rows  in  B.  The  product  matrix  C is  given  by  the  matrix  product 


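$$\mathbf{C} = \mathbf{A} \cdot \mathbf{B}$$

$$C_{ij} = [\mathbf{A}\mathbf{B}]_{ij} = \sum_{k} A_{ik} B_{kj} \tag{A.5}$$

For example, if both $\mathbf{A}$ and $\mathbf{B}$ are rank three symmetric matrices then

$$
\begin{aligned}
\mathbf{C} = \mathbf{A} \cdot \mathbf{B} &=
\begin{pmatrix}
A_{11} & A_{12} & A_{13} \\
A_{21} & A_{22} & A_{23} \\
A_{31} & A_{32} & A_{33}
\end{pmatrix}
\begin{pmatrix}
B_{11} & B_{12} & B_{13} \\
B_{21} & B_{22} & B_{23} \\
B_{31} & B_{32} & B_{33}
\end{pmatrix} \\
&=
\begin{pmatrix}
A_{11}B_{11} + A_{12}B_{21} + A_{13}B_{31} & A_{11}B_{12} + A_{12}B_{22} + A_{13}B_{32} & A_{11}B_{13} + A_{12}B_{23} + A_{13}B_{33} \\
A_{21}B_{11} + A_{22}B_{21} + A_{23}B_{31} & A_{21}B_{12} + A_{22}B_{22} + A_{23}B_{32} & A_{21}B_{13} + A_{22}B_{23} + A_{23}B_{33} \\
A_{31}B_{11} + A_{32}B_{21} + A_{33}B_{31} & A_{31}B_{12} + A_{32}B_{22} + A_{33}B_{32} & A_{31}B_{13} + A_{32}B_{23} + A_{33}B_{33}
\end{pmatrix}
\end{aligned}
\tag{A.6}
$$

In general, multiplication of matrices $\mathbf{A}$ and $\mathbf{B}$ is noncommutative, i.e.

$$\mathbf{A} \cdot \mathbf{B} \neq \mathbf{B} \cdot \mathbf{A} \tag{A.7}$$

In the special case when $\mathbf{A} \cdot \mathbf{B} = \mathbf{B} \cdot \mathbf{A}$ then the matrices are said to commute.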
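
The row-times-column rule and the fact that matrix products generally do not commute can be checked numerically. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows; the helper name `matmul` and the two sample matrices are illustrative choices and do not come from the text.

```python
def matmul(a, b):
    """Return C = A.B with C[i][j] = sum over k of A[i][k] * B[k][j]."""
    if len(a[0]) != len(b):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Two real 2x2 matrices chosen (as an assumption) so that they do not commute.
A = [[0, 1],
     [1, 0]]
B = [[1, 0],
     [0, -1]]

AB = matmul(A, B)   # [[0, -1], [1, 0]]
BA = matmul(B, A)   # [[0, 1], [-1, 0]]
print("A.B =", AB)
print("B.A =", BA)
print("commute:", AB == BA)
```

For this particular pair the two products differ by an overall sign, so the matrices do not commute.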

Transposed matrix $\mathbf{A}^T$

The transpose of a matrix $\mathbf{A}$ will be denoted by $\mathbf{A}^T$ and is given by interchanging rows and columns, that is

$$(A^T)_{ij} = A_{ji} \tag{A.8}$$

The transpose of a column vector is a row vector. Note that older texts use the symbol $\tilde{\mathbf{A}}$ for the transpose.

Identity  (unity)  matrix  I 

The  identity  (unity)  matrix  I is  diagonal  with  diagonal  elements  equal  to  1,  that  is 

$$I_{ij} = \delta_{ij} \tag{A.9}$$

where the Kronecker delta symbol is defined by

$$
\delta_{ik} =
\begin{cases}
0 & \text{if } i \neq k \\
1 & \text{if } i = k
\end{cases}
\tag{A.10}
$$

Inverse matrix $\mathbf{A}^{-1}$

If a matrix is non-singular, that is, its determinant is non-zero, then it is possible to define an inverse matrix
$\mathbf{A}^{-1}$. A non-singular square matrix has an inverse matrix for which the product is

$$\mathbf{A} \cdot \mathbf{A}^{-1} = \mathbf{I} \tag{A.11}$$

Orthogonal  matrix 

A matrix  with  real  elements  is  orthogonal  if 

$$\mathbf{A}^T = \mathbf{A}^{-1} \tag{A.12}$$

That is

$$\sum_{k} (A^T)_{ik} A_{kj} = \sum_{k} A_{ki} A_{kj} = \delta_{ij} \tag{A.13}$$

Adjoint matrix $\mathbf{A}^\dagger$

For a matrix with complex elements, the adjoint matrix, denoted by $\mathbf{A}^\dagger$, is defined as the transpose of the
complex conjugate

$$(A^\dagger)_{ij} = A^*_{ji} \tag{A.14}$$

Hermitian  matrix 

The Hermitian conjugate of a complex matrix $\mathbf{H}$ is denoted as $\mathbf{H}^\dagger$ and is defined as

$$\mathbf{H}^\dagger = (\mathbf{H}^T)^* = (\mathbf{H}^*)^T \tag{A.15}$$

Therefore

$$H^\dagger_{ij} = H^*_{ji} \tag{A.16}$$

A matrix is Hermitian if it is equal to its adjoint

$$\mathbf{H}^\dagger = \mathbf{H} \tag{A.17}$$

that is

$$H^\dagger_{ij} = H^*_{ji} = H_{ij} \tag{A.18}$$

A matrix that is both Hermitian and has real elements is a symmetric matrix since complex conjugation has
no effect.

Unitary matrix

A matrix with complex elements is unitary if its inverse is equal to the adjoint matrix

$$\mathbf{U}^\dagger = \mathbf{U}^{-1} \tag{A.19}$$

which is equivalent to

$$\mathbf{U}^\dagger \mathbf{U} = \mathbf{I} \tag{A.20}$$


A unitary matrix with real elements is an orthogonal matrix as given in equation A.12.
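
The definitions of the adjoint, Hermitian, and unitary matrices can be illustrated with a short numerical check. This is a sketch only, using Python's built-in complex numbers; the helper names (`adjoint`, `matmul`, `close`) and the sample matrices `H` and `U` are assumptions made for illustration.

```python
import math


def adjoint(m):
    """Adjoint: element (i, j) is the complex conjugate of element (j, i)."""
    rows, cols = len(m), len(m[0])
    return [[m[j][i].conjugate() for j in range(rows)] for i in range(cols)]


def matmul(a, b):
    """Row-times-column matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def close(a, b, tol=1e-12):
    """Element-wise comparison within a small tolerance."""
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


identity = [[1, 0], [0, 1]]

# Sample Hermitian matrix: real diagonal, off-diagonal elements conjugate to each other.
H = [[2 + 0j, 1j],
     [-1j, 3 + 0j]]

# Sample unitary matrix: its columns are orthonormal complex vectors.
s = 1 / math.sqrt(2)
U = [[s + 0j, s + 0j],
     [1j * s, -1j * s]]

is_hermitian = close(adjoint(H), H)
is_unitary = close(matmul(adjoint(U), U), identity)
print("H is Hermitian:", is_hermitian)
print("U is unitary:", is_unitary)
```

A tolerance is used in the comparison because the factor $1/\sqrt{2}$ is not represented exactly in floating point.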


Trace  of  a square  matrix  Tr A 

The  trace  of  a square  matrix,  denoted  by  Tr  A,  is  defined  as  the  sum  of  the  diagonal  matrix  elements. 

$$\mathrm{Tr}\,\mathbf{A} = \sum_{i=1}^{N} A_{ii} \tag{A.21}$$


Inner  product  of  column  vectors 

Real  vectors  The  generalization  of  the  scalar  (dot)  product  in  Euclidean  space  is  called  the  inner  prod- 
uct. Exploiting  the  rules  of  matrix  multiplication  requires  taking  the  transpose  of  the  first  column  vector 
to  form  a row  vector  which  then  is  multiplied  by  the  second  column  vector  using  the  conventional  rules  for 
matrix  multiplication.  That  is,  for  rank  N vectors 


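$$
[\mathbf{X}] \cdot [\mathbf{Y}] =
\begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{pmatrix}
\cdot
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{pmatrix}
= [\mathbf{X}]^T [\mathbf{Y}] =
\begin{pmatrix} X_1 & X_2 & \cdots & X_N \end{pmatrix}
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{pmatrix}
= \sum_{i=1}^{N} X_i Y_i
\tag{A.22}
$$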


For  rank  N = 3 this  inner  product  agrees  with  the  conventional  definition  of  the  scalar  product  and  gives  a 
result  that  is  a scalar.  For  the  special  case  when  [A]  • [B]  = 0 then  the  two  matrices  are  called  orthogonal. 
The  magnitude  squared  of  a column  vector  is  given  by  the  inner  product 


$$[\mathbf{X}] \cdot [\mathbf{X}] = \sum_{i=1}^{N} (X_i)^2 > 0 \tag{A.23}$$


Note that this is positive for any non-zero real vector.


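Complex vectors  For vectors having complex matrix elements the inner product is generalized to a form
that is consistent with equation A.22 when the column vector matrix elements are real.

$$
[\mathbf{X}]^* \cdot [\mathbf{Y}] = [\mathbf{X}]^\dagger [\mathbf{Y}] =
\begin{pmatrix} X_1^* & X_2^* & \cdots & X_{N-1}^* & X_N^* \end{pmatrix}
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_{N-1} \\ Y_N \end{pmatrix}
= \sum_{i=1}^{N} X_i^* Y_i
\tag{A.24}
$$

For the special case

$$[\mathbf{X}]^* \cdot [\mathbf{X}] = [\mathbf{X}]^\dagger [\mathbf{X}] = \sum_{i=1}^{N} X_i^* X_i > 0 \tag{A.25}$$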
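
The inner products of equations A.22 and A.24 reduce to a single sum over components. The sketch below assumes vectors are stored as Python lists; the function name `inner` and the sample vectors are illustrative and not taken from the text.

```python
def inner(x, y):
    """Inner product: sum of conj(x_i) * y_i.

    For real components the conjugation has no effect, so this reduces to
    the ordinary scalar (dot) product.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


# Real vectors: the usual dot product.
real_dot = inner([1, 2, 3], [4, 5, 6])      # 1*4 + 2*5 + 3*6 = 32

# Complex vectors (sample values chosen for illustration).
X = [1 + 2j, 3 - 1j]
Y = [2 - 1j, 1j]
xy = inner(X, Y)                             # (-1-2j)
xx = inner(X, X)                             # |1+2j|^2 + |3-1j|^2 = 15

print(real_dot, xy, xx)
```

Note that the inner product of a complex vector with itself has zero imaginary part, which is why the conjugate is taken on the first vector.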
A.3  Determinants

Definition 

The  determinant  of  a square  matrix  with  N rows  equals  a single  number  derived  using  the  matrix  elements 
of  the  matrix.  The  determinant  is  denoted  as  det  A or  |A|  where 

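$$|\mathbf{A}| = \sum_{j_1, j_2, \ldots, j_N} \varepsilon(j_1, j_2, \ldots, j_N)\, A_{1j_1} A_{2j_2} \cdots A_{Nj_N} \tag{A.26}$$

where $\varepsilon(j_1, j_2, \ldots, j_N)$ is the permutation index, which is $+1$ or $-1$ depending on whether the number of
permutations required to go from the normal order $(1, 2, 3, \ldots, N)$ to the sequence $(j_1, j_2, j_3, \ldots, j_N)$ is even or odd.
For example, for $N = 3$ the determinant is

$$|\mathbf{A}| = A_{11}A_{22}A_{33} + A_{12}A_{23}A_{31} + A_{13}A_{21}A_{32} - A_{13}A_{22}A_{31} - A_{11}A_{23}A_{32} - A_{12}A_{21}A_{33} \tag{A.27}$$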
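
The permutation definition of the determinant can be written out directly as code. The following is a minimal sketch, assuming a square matrix stored as a list of rows; the names `permutation_sign` and `det` are illustrative. The approach sums over all $N!$ permutations, so it is meant only to mirror equation A.26 for small matrices, not as an efficient algorithm.

```python
from itertools import permutations


def permutation_sign(perm):
    """Return +1 for an even permutation and -1 for an odd one.

    The parity is found by counting inversions, i.e. pairs that are out of
    their normal order.
    """
    inversions = sum(1
                     for i in range(len(perm))
                     for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det(a):
    """Determinant as the signed sum over permutations of the column indices."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = permutation_sign(perm)
        for row, col in enumerate(perm):
            term *= a[row][col]
        total += term
    return total


print(det([[1, 2], [3, 4]]))                     # 1*4 - 2*3 = -2
print(det([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))    # 24 + 0 + 1 - 3 - 4 - 0 = 18
```

For $N = 3$ the six terms generated by the loop are the six terms of equation A.27.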

Properties 

1.  The value of a determinant $|\mathbf{A}| = 0$, if

(a)  all  elements  of  a row  (column)  are  zero. 

(b)  all  elements  of  a row  (column)  are  identical  with,  or  multiples  of,  the  corresponding  elements  of 
another  row  (column). 

2.  The  value  of  a determinant  is  unchanged  if 

(a)  rows  and  columns  are  interchanged. 

(b)  a linear  combination  of  any  number  of  rows  is  added  to  any  one  row. 

3.  The  value  of  a determinant  changes  sign  if  two  rows,  or  any  two  columns,  are  interchanged. 

4.  Transposing a square matrix does not change its determinant: $|\mathbf{A}^T| = |\mathbf{A}|$.

5.  If  any  row  (column)  is  multiplied  by  a constant  factor  then  the  value  of  the  determinant  is  multiplied 
by  the  same  factor. 

6.  The  determinant  of  a diagonal  matrix  equals  the  product  of  the  diagonal  matrix  elements.  That  is, 
when $A_{ij} = \lambda_i \delta_{ij}$ then $|\mathbf{A}| = \lambda_1 \lambda_2 \lambda_3 \cdots \lambda_N$.

7.  The  determinant  of  the  identity  (unity)  matrix  |I|  = 1. 

8.  The determinant of the null matrix, for which all matrix elements are zero, is $|\mathbf{0}| = 0$.

9.  A singular  matrix  has  a determinant  equal  to  zero. 

10.  If  each  element  of  any  row  (column)  appears  as  the  sum  (difference)  of  two  or  more  quantities,  then 
the  determinant  can  be  written  as  a sum  (difference)  of  two  or  more  determinants  of  the  same  order. 
For  example  for  order  N = 2, 

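$$
\begin{vmatrix}
A_{11} \pm B_{11} & A_{12} \pm B_{12} \\
A_{21} & A_{22}
\end{vmatrix}
=
\begin{vmatrix}
A_{11} & A_{12} \\
A_{21} & A_{22}
\end{vmatrix}
\pm
\begin{vmatrix}
B_{11} & B_{12} \\
A_{21} & A_{22}
\end{vmatrix}
$$

11.  The determinant of a matrix product equals the product of the determinants. That is, if $\mathbf{C} = \mathbf{A}\mathbf{B}$ then
$|\mathbf{C}| = |\mathbf{A}||\mathbf{B}|$.

Cofactor  of  a square  matrix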

For a square matrix having $N$ rows, the complementary minor is obtained by removing the $i$th row and the $j$th column
and then collapsing the remaining matrix elements into a square matrix with $N - 1$ rows while preserving
the order of the matrix elements. This complementary minor is denoted as $\mathbf{A}^{(ij)}$. The
matrix elements of the cofactor square matrix $\mathbf{a}$ are obtained by multiplying the determinant of the $(ij)$
complementary minor by the phase factor $(-1)^{i+j}$. That is

$$a_{ij} = (-1)^{i+j} \left| \mathbf{A}^{(ij)} \right| \tag{A.28}$$

The cofactor matrix has the property that

$$\sum_{k=1}^{N} A_{ik} a_{jk} = \delta_{ij} |\mathbf{A}| = \sum_{k=1}^{N} A_{ki} a_{kj} \tag{A.29}$$

Cofactors  are  used  to  expand  the  determinant  of  a square  matrix  in  order  to  evaluate  the  determinant. 


Inverse  of  a non-singular  matrix 

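The $(i,j)$ matrix elements of the inverse matrix $\mathbf{A}^{-1}$ of a non-singular matrix $\mathbf{A}$ are given by the ratio of
the cofactor $a_{ji}$ and the determinant $|\mathbf{A}|$, that is

$$A^{-1}_{ij} = \frac{1}{|\mathbf{A}|} a_{ji} \tag{A.30}$$

Equations A.28 and A.29 can be used to evaluate the $i, j$ element of the matrix product $(\mathbf{A}^{-1}\mathbf{A})$

$$(\mathbf{A}^{-1}\mathbf{A})_{ij} = \sum_{k=1}^{N} A^{-1}_{ik} A_{kj} = \frac{1}{|\mathbf{A}|} \sum_{k=1}^{N} a_{ki} A_{kj} = \frac{1}{|\mathbf{A}|} |\mathbf{A}|\, \delta_{ij} = \delta_{ij} = I_{ij} \tag{A.31}$$

This agrees with equation A.11 that $\mathbf{A} \cdot \mathbf{A}^{-1} = \mathbf{I}$.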
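
The cofactor construction of the inverse can be sketched in a few lines. The code below assumes integer matrix elements so that exact rational arithmetic (`fractions.Fraction`) can be used; the helper names `minor`, `det`, and `inverse` are illustrative, and the determinant is evaluated by cofactor expansion along the first row.

```python
from fractions import Fraction


def minor(a, i, j):
    """Complementary minor: remove row i and column j, preserving the order."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by cofactor expansion along the first row."""
    if not a:
        return 1  # determinant of the empty (0 x 0) matrix
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))


def inverse(a):
    """Inverse from cofactors: element (i, j) is cofactor (j, i) over the determinant."""
    n = len(a)
    d = det(a)
    if d == 0:
        raise ValueError("matrix is singular, no inverse exists")
    cof = [[(-1) ** (i + j) * det(minor(a, i, j)) for j in range(n)]
           for i in range(n)]
    # Note the transposition of the cofactor indices.
    return [[Fraction(cof[j][i], d) for j in range(n)] for i in range(n)]


M = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
M_inv = inverse(M)
product = [[sum(M_inv[i][k] * M[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
print(product == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
```

Using exact fractions avoids rounding error, so the product of the inverse with the original matrix reproduces the identity matrix exactly.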

The  inverse  of  rank  2 or  3 matrices  is  required  frequently  when  determining  the  eigen-solutions  for  rigid- 
body  rotation,  or  coupled  oscillator,  problems  in  classical  mechanics  as  described  in  chapters  11  and  12. 
Therefore  it  is  convenient  to  list  explicitly  the  inverse  matrices  for  both  rank  2 and  rank  3 matrices. 


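Inverse for rank 2 matrices:

$$
\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1}
= \frac{1}{|\mathbf{A}|}
\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}
= \frac{1}{(ad - bc)}
\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}
\tag{A.32}
$$

where the determinant of $\mathbf{A}$ is written explicitly in equation A.32.

Inverse for rank 3 matrices:

$$
\mathbf{A}^{-1} =
\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}^{-1}
= \frac{1}{|\mathbf{A}|}
\begin{pmatrix} A & B & C \\ D & E & F \\ G & H & I \end{pmatrix}^{T}
= \frac{1}{|\mathbf{A}|}
\begin{pmatrix} A & D & G \\ B & E & H \\ C & F & I \end{pmatrix}
= \frac{1}{aA + bB + cC}
\begin{pmatrix} A & D & G \\ B & E & H \\ C & F & I \end{pmatrix}
$$

$$
\begin{aligned}
A &= (ei - fh) & D &= -(bi - ch) & G &= (bf - ce) \\
B &= -(di - fg) & E &= (ai - cg) & H &= -(af - cd) \\
C &= (dh - eg) & F &= -(ah - bg) & I &= (ae - bd)
\end{aligned}
\tag{A.33}
$$

where the functions $A, B, C, D, E, F, G, H, I$ are equal to rank 2 determinants listed in equation A.33.

A.4  Reduction  of  a matrix  to  diagonal  form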

Solving  coupled  linear  equations  can  be  reduced  to  diagonalization  of  a matrix.  Consider  the  matrix  A 
operating  on  the  vector  X to  produce  a vector  Y,  that  are  expressed  as  components  with  respect  to  the 
unprimed  coordinate  frame,  i.e. 

A X = Y (A. 34) 

Consider  that  the  unitary  real  matrix  R with  rank  n,  rotates  the  n-dimensional  un-primed  coordinate 
frame  into  the  primed  coordinate  frame  such  that  A , X and  Y are  transformed  to  A'  , X'  and  Y'  in  the 
rotated  primed  coordinate  frame.  Then 

X'  = R X 

Y'  = R Y (A. 35) 

With  respect  to  the  primed  coordinate  frame  equation  (A. 34)  becomes 

R-  (A  -X)  = R Y (A. 36) 

R A R 1 R X = R Y (A. 37) 

R A R’1  -X'  = A'  • X'  = Y'  (A. 38) 

using  the  fact  that  the  identity  matrix  I = R ■ R 1 = R ■ RT  since  the  rotation  matrix  in  n dimensions  is 
orthogonal. 

Thus  we  have  that  the  rotated  matrix 


A'  = R A Rt  (A. 39) 

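Let us assume that this transformed matrix is diagonal; then it can be written as the product of the unit matrix $\mathbf{I}$ and a set of scalar numbers $\lambda$, called the characteristic roots, as

$$\mathbf{A}' = \mathbf{R}\cdot\mathbf{A}\cdot\mathbf{R}^{T} = \lambda\mathbf{I} \tag{A.40}$$

Using the fact that $\mathbf{R}^{T} = \mathbf{R}^{-1}$ then gives

$$\mathbf{R}^{T}\cdot(\lambda\mathbf{I}) = \mathbf{A}'\cdot\mathbf{R}^{T} \tag{A.41}$$

Letting both sides of equation (A.41) act on $\mathbf{X}'$ gives

$$\lambda\mathbf{I}\cdot\mathbf{X}' = \mathbf{A}'\cdot\mathbf{X}' \tag{A.42}$$

or

$$[\lambda\mathbf{I} - \mathbf{A}']\,\mathbf{X}' = 0 \tag{A.43}$$

This represents a set of $n$ homogeneous linear algebraic equations in the $n$ unknowns $\mathbf{X}'$, where $\lambda$ is a set of characteristic roots (eigenvalues) with corresponding eigenvectors $\mathbf{X}'$. Ignoring the trivial case of $\mathbf{X}'$ being zero, (A.43) requires that the secular determinant of the bracket be zero, that is

$$|\lambda\mathbf{I} - \mathbf{A}'| = 0 \tag{A.44}$$

The determinant can be expanded and factored into the form

$$(\lambda-\lambda_1)(\lambda-\lambda_2)(\lambda-\lambda_3)\cdots(\lambda-\lambda_n) = 0 \tag{A.45}$$

where the $n$ eigenvalues of the matrix $\mathbf{A}'$ are $\lambda = \lambda_1, \lambda_2, \ldots, \lambda_n$.

The eigenvectors $\mathbf{X}'$ corresponding to each eigenvalue are determined by substituting a given eigenvalue $\lambda_i$ into the relation

$$\mathbf{X}'^{T}\cdot\mathbf{A}'\cdot\mathbf{X}' = [\lambda_i\delta_{ij}] \tag{A.46}$$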

If all the eigenvalues are distinct, i.e. different, then this set of n equations completely determines the ratio of the components of each eigenvector along the axes of the coordinate frame. However, when two or more eigenvalues are identical (degenerate), the corresponding eigenvectors are not uniquely determined, and one has the freedom to select an appropriate eigenvector that is orthogonal to the remaining axes.

In summary, a matrix can be fully diagonalized if any one of the following holds: (a) all the eigenvalues are distinct, (b) the real matrix is symmetric, (c) it is unitary.

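A frequent application of matrices in classical mechanics is for solving a system of homogeneous linear equations of the form

$$\begin{aligned}
A_{11}x_1 + A_{12}x_2 + \cdots + A_{1n}x_n &= 0 \\
A_{21}x_1 + A_{22}x_2 + \cdots + A_{2n}x_n &= 0 \\
&\;\;\vdots \\
A_{n1}x_1 + A_{n2}x_2 + \cdots + A_{nn}x_n &= 0
\end{aligned} \tag{A.47}$$

Making the following definitions

$$\mathbf{A} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{pmatrix} \tag{A.48}$$

$$\mathbf{X} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \tag{A.49}$$

the set of linear equations can be written in a compact form using the matrices

$$\mathbf{A}\cdot\mathbf{X} = 0 \tag{A.50}$$

which can be solved using equation (A.43). Ensure that you are able to diagonalize matrices of rank 2 and 3. You can use Mathematica, Maple, MATLAB, or other such mathematical computer programs to diagonalize larger matrices.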
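As a minimal sketch of the rotation $\mathbf{A}' = \mathbf{R}\cdot\mathbf{A}\cdot\mathbf{R}^{T}$ of equation (A.39), the following Python fragment diagonalizes a real symmetric matrix of rank 2 with a single plane rotation. The function names and the test matrix are illustrative assumptions, not part of the text, and the closed-form rotation angle only applies to the 2 x 2 case.

```python
import math

def matmul(P, Q):
    """Multiply two square matrices stored as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(P):
    """Return the transpose of a matrix stored as nested lists."""
    return [list(col) for col in zip(*P)]

def diagonalize_symmetric_2x2(A):
    """Return (R, A_prime) with A_prime = R A R^T diagonal.

    R is a plane rotation whose angle is chosen so that the
    off-diagonal element of R A R^T vanishes.
    """
    a, b, d = A[0][0], A[0][1], A[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, s], [-s, c]]
    return R, matmul(matmul(R, A), transpose(R))

A = [[2.0, 1.0], [1.0, 2.0]]      # illustrative symmetric matrix
R, A_prime = diagonalize_symmetric_2x2(A)
print(A_prime)                    # diagonal entries are the eigenvalues
```

The diagonal elements of `A_prime` are the eigenvalues, and the rows of `R` are the corresponding eigenvectors expressed in the unprimed frame.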


A.1 Example: Eigenvalues and eigenvectors of a real symmetric matrix

Consider the matrix

$$\mathbf{A} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

The secular determinant is given by (A.44)

$$\begin{vmatrix} -\lambda & 1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{vmatrix} = 0$$

This expands to

$$-\lambda(\lambda+1)(\lambda-1) = 0$$

Thus the three eigenvalues are $\lambda = -1, 0, 1$.

To find each eigenvector we substitute the corresponding eigenvalue into equation (A.43).

The eigenvalue $\lambda = -1$ yields $x + y = 0$ and $z = 0$. Thus the eigenvector is $\mathbf{r}_1 = (\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}, 0)$. The eigenvalue $\lambda = 0$ yields $x = 0$ and $y = 0$. Thus the eigenvector is $\mathbf{r}_2 = (0, 0, 1)$. The eigenvalue $\lambda = 1$ yields $-x + y = 0$ and $z = 0$. Thus the eigenvector is $\mathbf{r}_3 = (\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0)$. The orthogonality of these three eigenvectors, which correspond to three distinct eigenvalues, can be verified.

A.2 Example: Degenerate eigenvalues of real symmetric matrix

This example illustrates how to generate eigenvectors corresponding to degenerate eigenvalues. Consider the matrix

$$\mathbf{A} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$

The secular determinant is given by (A.44)

$$\begin{vmatrix} 1-\lambda & 0 & 0 \\ 0 & -\lambda & 1 \\ 0 & 1 & -\lambda \end{vmatrix} = 0$$

This expands to

$$(1-\lambda)(\lambda+1)(\lambda-1) = 0$$

Thus the three eigenvalues are $\lambda = -1, 1, 1$.

The eigenvectors are determined by substituting the corresponding eigenvalue into equation (A.42).

The eigenvalue $\lambda = -1$ yields $2x = 0$ and $y + z = 0$. Thus the eigenvector is $\mathbf{r}_1 = (0, \frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}})$. The eigenvalue $\lambda = 1$ yields $-y + z = 0$. The eigenvector $\mathbf{r}_2$ must be perpendicular to $\mathbf{r}_1$ and there are an infinite number of choices. Let us assume that $\mathbf{r}_2 = (0, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$, which satisfies equation (A.50); then the eigenvector $\mathbf{r}_3$ must be perpendicular to both $\mathbf{r}_1$ and $\mathbf{r}_2$. For rank three this is found using

$$\mathbf{r}_3 = \mathbf{r}_1 \times \mathbf{r}_2 = (1, 0, 0)$$
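The construction in Example A.2 can be checked numerically. The sketch below, written in plain Python with helper names that are assumptions of this illustration, builds the third eigenvector from the cross product and confirms that each vector satisfies the eigenvalue equation.

```python
import math

def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def apply(M, v):
    """Matrix M (tuple of rows) acting on the vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

A = ((1.0, 0.0, 0.0),
     (0.0, 0.0, 1.0),
     (0.0, 1.0, 0.0))

s = 1.0 / math.sqrt(2.0)
r1 = (0.0, s, -s)        # eigenvalue -1
r2 = (0.0, s, s)         # one allowed choice for the degenerate eigenvalue +1
r3 = cross(r1, r2)       # completes the orthogonal triad, also eigenvalue +1

def is_eigenvector(M, v, lam, tol=1e-12):
    """True if M v = lam v to within the tolerance tol."""
    return all(abs(mv - lam * x) < tol for mv, x in zip(apply(M, v), v))

print(r3, is_eigenvector(A, r1, -1.0), is_eigenvector(A, r2, 1.0),
      is_eigenvector(A, r3, 1.0))
```

Any other unit vector in the plane spanned by `r2` and `r3` would serve equally well as an eigenvector for the degenerate eigenvalue.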
Appendix B


Vector  algebra 


B.1 Linear operations

The important force fields in classical mechanics, namely, gravitation, electric, and magnetic, are vector fields that have a position-dependent magnitude and direction. Thus, it is useful to summarize the algebra of vector fields.

A vector $\mathbf{a}$ has both a magnitude $|a|$ and a direction defined by the unit vector $\hat{\mathbf{e}}_a$, that is, the vector can be written as a bold character $\mathbf{a}$ where

$$\mathbf{a} = a\,\hat{\mathbf{e}}_a \tag{B.1}$$

where by convention the implied modulus sign is omitted. The hat symbol on the vector $\hat{\mathbf{e}}_a$ designates that this is a unit vector with modulus $|\hat{\mathbf{e}}_a| = 1$.

Vector force fields are assumed to be linear, and consequently they obey the principle of superposition, are commutative, associative, and distributive as illustrated below for three vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ plus a scalar multiplier $\gamma$.

$$\begin{aligned}
\mathbf{a} \pm \mathbf{b} &= \pm\mathbf{b} + \mathbf{a} \\
\mathbf{a} + (\mathbf{b} + \mathbf{c}) &= (\mathbf{a} + \mathbf{b}) + \mathbf{c} \\
\gamma(\mathbf{a} + \mathbf{b}) &= \gamma\mathbf{a} + \gamma\mathbf{b}
\end{aligned} \tag{B.2}$$

The manipulation of vectors is greatly facilitated by use of components along an orthogonal coordinate system defined by three orthogonal unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$. For example the cartesian coordinate system is defined by three unit vectors which, by convention, are called $(\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}})$.

B.2 Scalar product

Multiplication of two vectors can produce a 9-component tensor that can be represented by a 3 x 3 matrix as discussed in appendix E. There are two special cases for vector multiplication that are important for vector algebra; the first is the scalar product, and the second is the vector product.

The scalar product of two vectors is defined to be

$$\mathbf{a}\cdot\mathbf{b} = |a|\,|b|\cos\theta \tag{B.3}$$

where $\theta$ is the angle between the two vectors. It is a scalar and thus is independent of the orientation of the coordinate axis system. Note that the scalar product commutes, is distributive, and is associative with a scalar multiplier $\lambda$, that is

$$\begin{aligned}
\mathbf{a}\cdot\mathbf{b} &= \mathbf{b}\cdot\mathbf{a} \\
\mathbf{a}\cdot(\mathbf{b} + \mathbf{c}) &= \mathbf{a}\cdot\mathbf{b} + \mathbf{a}\cdot\mathbf{c} \\
(\lambda\mathbf{a})\cdot\mathbf{b} &= \lambda(\mathbf{b}\cdot\mathbf{a})
\end{aligned} \tag{B.4}$$

Note that $\mathbf{a}\cdot\mathbf{a} = |a|^2$ and if $\mathbf{a}$ and $\mathbf{b}$ are perpendicular then $\cos\theta = 0$ and thus $\mathbf{a}\cdot\mathbf{b} = 0$.



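If the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis, that is, they are orthogonal unit vectors, then from equations B.3 and B.4

$$\hat{\mathbf{e}}_i\cdot\hat{\mathbf{e}}_k = \delta_{ik} \tag{B.5}$$

If $\hat{\mathbf{a}}$ is the unit vector for the vector $\mathbf{a}$ then the scalar product of a vector $\mathbf{a}$ with one of these unit vectors $\hat{\mathbf{e}}_n$ involves the cosine of the angle between the vector $\mathbf{a}$ and $\hat{\mathbf{e}}_n$, that is

$$\begin{aligned}
\mathbf{a}\cdot\hat{\mathbf{e}}_1 &= |a|\,(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_1) = |a|\cos\alpha \\
\mathbf{a}\cdot\hat{\mathbf{e}}_2 &= |a|\,(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_2) = |a|\cos\beta \\
\mathbf{a}\cdot\hat{\mathbf{e}}_3 &= |a|\,(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_3) = |a|\cos\gamma
\end{aligned} \tag{B.6}$$

where the cosines are called the direction cosines since they define the direction of the vector $\mathbf{a}$ with respect to each orthogonal basis unit vector. Moreover, $\mathbf{a}\cdot\hat{\mathbf{e}}_1 = |a|\,\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_1 = |a|\cos\alpha$ is the component of $\mathbf{a}$ along the $\hat{\mathbf{e}}_1$ axis. Thus the three components of the vector $\mathbf{a}$ are fully defined by the magnitude $|a|$ and the direction cosines, corresponding to the angles $\alpha, \beta, \gamma$. That is,

$$\begin{aligned}
a_1 &= |a|\,(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_1) = |a|\cos\alpha \\
a_2 &= |a|\,(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_2) = |a|\cos\beta \\
a_3 &= |a|\,(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_3) = |a|\cos\gamma
\end{aligned} \tag{B.7}$$

If the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis then the vector is fully defined by

$$\mathbf{a} = a_1\hat{\mathbf{e}}_1 + a_2\hat{\mathbf{e}}_2 + a_3\hat{\mathbf{e}}_3 \tag{B.8}$$

Consider two vectors

$$\begin{aligned}
\mathbf{a} &= a_1\hat{\mathbf{e}}_1 + a_2\hat{\mathbf{e}}_2 + a_3\hat{\mathbf{e}}_3 \\
\mathbf{b} &= b_1\hat{\mathbf{e}}_1 + b_2\hat{\mathbf{e}}_2 + b_3\hat{\mathbf{e}}_3
\end{aligned}$$

Then using B.5

$$\mathbf{a}\cdot\mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3 = |a|\,|b|\cos\theta \tag{B.9}$$

where $\theta$ is the angle between the two vectors. In particular, since the direction cosine $\cos\alpha_a = a_1/|a|$, then equation B.9 gives

$$\cos\theta = \cos\alpha_a\cos\alpha_b + \cos\beta_a\cos\beta_b + \cos\gamma_a\cos\gamma_b \tag{B.10}$$

Note that when $\theta = 0$ then B.10 gives

$$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1 \tag{B.11}$$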
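Equations (B.9) to (B.11) are easy to confirm numerically. The following Python sketch uses two arbitrarily chosen vectors (an assumption of this example) and compares the angle obtained from the direction cosines with the angle obtained directly from the scalar product.

```python
import math

def norm(v):
    """Magnitude |v| of a vector given as a tuple of components."""
    return math.sqrt(sum(x * x for x in v))

def direction_cosines(v):
    """Cosines of the angles between v and the three basis vectors (B.7)."""
    m = norm(v)
    return tuple(x / m for x in v)

a = (1.0, 2.0, 2.0)      # illustrative vectors, not taken from the text
b = (2.0, -1.0, 2.0)

cos_a = direction_cosines(a)
cos_b = direction_cosines(b)

# Equation (B.11): the squared direction cosines of one vector sum to one.
sum_sq = sum(c * c for c in cos_a)

# Equation (B.10): cos(theta) built from the direction cosines ...
cos_theta_from_cosines = sum(ca * cb for ca, cb in zip(cos_a, cos_b))
# ... compared with equation (B.9), a.b / (|a||b|).
cos_theta_from_dot = sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

print(sum_sq, cos_theta_from_cosines, cos_theta_from_dot)
```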


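B.3 Vector product

The vector product of two vectors is defined to be

$$\mathbf{c} = \mathbf{a}\times\mathbf{b} = |a|\,|b|\sin\theta\,\hat{\mathbf{n}} \tag{B.12}$$

where $\theta$ is the angle between the vectors and $\hat{\mathbf{n}}$ is a unit vector perpendicular to the plane defined by $\mathbf{a}$ and $\mathbf{b}$ such that the unit vectors $(\hat{\mathbf{a}}, \hat{\mathbf{b}}, \hat{\mathbf{n}})$ obey a right-handed screw rule. The vector product acts like a pseudovector which comprises a normal vector multiplied by a sign factor that depends on the handedness of the system as described in appendix D.3.

The components of $\mathbf{c}$ are defined by the relation

$$c_i = \sum_{jk}\varepsilon_{ijk}\,a_j b_k \tag{B.13}$$

where the (Levi-Civita) permutation symbol $\varepsilon_{ijk}$ has the following properties

$$\begin{aligned}
\varepsilon_{ijk} &= 0 && \text{if an index is equal to any other index} \\
\varepsilon_{ijk} &= +1 && \text{if } i, j, k \text{ form an even permutation of } 1, 2, 3 \\
\varepsilon_{ijk} &= -1 && \text{if } i, j, k \text{ form an odd permutation of } 1, 2, 3
\end{aligned} \tag{B.14}$$

For example, if the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis, then $\hat{\mathbf{e}}_j\times\hat{\mathbf{e}}_k = \sum_i \varepsilon_{ijk}\,\hat{\mathbf{e}}_i$, i.e.

$$\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_2 = \hat{\mathbf{e}}_3 \qquad \hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_3 = \hat{\mathbf{e}}_1 \qquad \hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_1 = \hat{\mathbf{e}}_2 \tag{B.15}$$

$$\hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_1 = -\hat{\mathbf{e}}_3 \qquad \hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_2 = -\hat{\mathbf{e}}_1 \qquad \hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_3 = -\hat{\mathbf{e}}_2 \tag{B.16}$$

$$\hat{\mathbf{e}}_1\times\hat{\mathbf{e}}_1 = 0 \qquad \hat{\mathbf{e}}_2\times\hat{\mathbf{e}}_2 = 0 \qquad \hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_3 = 0 \tag{B.17}$$

The vector product anticommutes in that

$$\mathbf{a}\times\mathbf{b} = -\mathbf{b}\times\mathbf{a} \tag{B.18}$$

However, it is distributive and associative with a scalar multiplier $\lambda$

$$\mathbf{a}\times(\mathbf{b} + \mathbf{c}) = \mathbf{a}\times\mathbf{b} + \mathbf{a}\times\mathbf{c} \tag{B.19}$$

$$(\lambda\mathbf{a})\times\mathbf{b} = \lambda(\mathbf{a}\times\mathbf{b}) \tag{B.20}$$

Note that when $\sin\theta = 0$ then $\mathbf{a}\times\mathbf{b} = 0$ and in particular, $\mathbf{a}\times\mathbf{a} = 0$.

Consider two vectors

$$\begin{aligned}
\mathbf{a} &= a_1\hat{\mathbf{e}}_1 + a_2\hat{\mathbf{e}}_2 + a_3\hat{\mathbf{e}}_3 \\
\mathbf{b} &= b_1\hat{\mathbf{e}}_1 + b_2\hat{\mathbf{e}}_2 + b_3\hat{\mathbf{e}}_3
\end{aligned}$$

Then using equations B.12 and B.15 - B.17

$$\mathbf{a}\times\mathbf{b} = |a|\,|b|\sin\theta\,\hat{\mathbf{n}} = \begin{vmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = \hat{\mathbf{e}}_1(a_2b_3 - a_3b_2) + \hat{\mathbf{e}}_2(a_3b_1 - a_1b_3) + \hat{\mathbf{e}}_3(a_1b_2 - a_2b_1)$$

where $\theta$ is the angle between the two vectors and the determinant is expanded along the top row. Examples of vector products are torque $\mathbf{N} = \mathbf{r}\times\mathbf{F}$, angular momentum $\mathbf{L} = \mathbf{r}\times\mathbf{p}$, and the magnetic force $\mathbf{F} = q\mathbf{v}\times\mathbf{B}$.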
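The component definition (B.13) and the determinant expansion give the same vector product. The sketch below illustrates this in Python; indices run over 0, 1, 2 in place of 1, 2, 3, and the closed-form expression used for the permutation symbol is a convenience chosen for this example rather than anything taken from the text.

```python
def levi_civita(i, j, k):
    """Permutation symbol for indices 0, 1, 2 (standing for 1, 2, 3)."""
    return (i - j) * (j - k) * (k - i) // 2

def cross_levi_civita(a, b):
    """Vector product from c_i = sum_jk eps_ijk a_j b_k, equation (B.13)."""
    return tuple(
        sum(levi_civita(i, j, k) * a[j] * b[k]
            for j in range(3) for k in range(3))
        for i in range(3))

def cross_determinant(a, b):
    """Vector product from expanding the determinant along its top row."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

a = (1, 2, 3)            # illustrative integer vectors
b = (4, 5, 6)
print(cross_levi_civita(a, b), cross_determinant(a, b))
```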


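B.4 Triple products

The following scalar and vector triple products can be formed from the product of three vectors and are used frequently.

Scalar triple products

There are several permutations of scalar triple products of three vectors $[\mathbf{a}, \mathbf{b}, \mathbf{c}]$ that are identical.

$$\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \mathbf{c}\cdot(\mathbf{a}\times\mathbf{b}) = \mathbf{b}\cdot(\mathbf{c}\times\mathbf{a}) = (\mathbf{a}\times\mathbf{b})\cdot\mathbf{c} = -\mathbf{a}\cdot(\mathbf{c}\times\mathbf{b}) \tag{B.21}$$

That is, the scalar triple product is invariant to cyclic permutations of the three vectors but changes sign for interchange of two vectors. The scalar triple product is unchanged by swapping the scalar (dot) and vector (cross) operations. Because of the symmetry the scalar triple product can be denoted as $[\mathbf{a}, \mathbf{b}, \mathbf{c}]$ and

$$\begin{aligned}
[\mathbf{a}, \mathbf{b}, \mathbf{c}] &> 0 && \text{if } [\mathbf{a}, \mathbf{b}, \mathbf{c}] \text{ is right-handed} \\
[\mathbf{a}, \mathbf{b}, \mathbf{c}] &= 0 && \text{if } [\mathbf{a}, \mathbf{b}, \mathbf{c}] \text{ is coplanar} \\
[\mathbf{a}, \mathbf{b}, \mathbf{c}] &< 0 && \text{if } [\mathbf{a}, \mathbf{b}, \mathbf{c}] \text{ is left-handed}
\end{aligned} \tag{B.22}$$

The scalar triple product can be written in terms of the components using a determinant

$$[\mathbf{a}, \mathbf{b}, \mathbf{c}] = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \tag{B.23}$$

Vector triple product

The vector triple product $\mathbf{a}\times(\mathbf{b}\times\mathbf{c})$ is a vector. Since $(\mathbf{b}\times\mathbf{c})$ is perpendicular to the plane of $\mathbf{b}, \mathbf{c}$, then $\mathbf{a}\times(\mathbf{b}\times\mathbf{c})$ must lie in the plane containing $\mathbf{b}, \mathbf{c}$. Therefore the triple product can be expanded in terms of $\mathbf{b}, \mathbf{c}$, as given by the following identity

$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = (\mathbf{a}\cdot\mathbf{c})\,\mathbf{b} - (\mathbf{a}\cdot\mathbf{b})\,\mathbf{c} \tag{B.24}$$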
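A short numerical check of the cyclic symmetry (B.21) and the expansion (B.24) is sketched below in Python. The three integer vectors are arbitrary choices made for this illustration, so the comparisons are exact.

```python
def dot(u, v):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def scalar_triple(a, b, c):
    """[a, b, c] = a . (b x c), equation (B.21)."""
    return dot(a, cross(b, c))

def vector_triple(a, b, c):
    """a x (b x c) evaluated directly from two vector products."""
    return cross(a, cross(b, c))

def bac_cab(a, b, c):
    """Right-hand side of equation (B.24): (a.c) b - (a.b) c."""
    ac, ab = dot(a, c), dot(a, b)
    return tuple(ac * bi - ab * ci for bi, ci in zip(b, c))

a, b, c = (1, 2, 3), (4, 5, 6), (7, 8, 10)   # illustrative vectors
print(scalar_triple(a, b, c), vector_triple(a, b, c), bac_cab(a, b, c))
```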


Workshop  exercises 


1.  Partition  the  following  exercises  among  the  group.  Once  you  have  completed  your  problem,  check  with  a 
classmate  before  writing  it  on  the  board.  After  you  have  verified  that  you  have  found  the  correct  solution, 
write  your  answer  in  the  space  provided  on  the  board,  taking  care  to  include  the  steps  that  you  used  to  arrive 
at  your  solution.  The  following  information  is  needed. 


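$$\mathbf{a} = 3\hat{\mathbf{i}} + 2\hat{\mathbf{j}} - 9\hat{\mathbf{k}} \qquad \mathbf{b} = -2\hat{\mathbf{i}} + 3\hat{\mathbf{k}} \qquad \mathbf{c} = -2\hat{\mathbf{i}} + \hat{\mathbf{j}} - 6\hat{\mathbf{k}} \qquad \mathbf{d} = \hat{\mathbf{i}} + 9\hat{\mathbf{j}} + 4\hat{\mathbf{k}}$$

Calculate each of the following

1 $|\mathbf{a} - (\mathbf{b} + 3\mathbf{c})|$

2 Component of $\mathbf{c}$ along $\mathbf{a}$

3 Angle between $\mathbf{c}$ and $\mathbf{d}$

4 $(\mathbf{b}\times\mathbf{d})\cdot\mathbf{a}$

5 $(\mathbf{b}\times\mathbf{d})\times\mathbf{a}$

6 $\mathbf{b}\times(\mathbf{d}\times\mathbf{a})$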


8 |HE| 

9 EHG 


10  EG  HG 

11  EH  HTEr 

12  F-1 


Problems 

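[1] For what values of $a$ are the vectors $\mathbf{A} = 2a\hat{\mathbf{i}} - 2\hat{\mathbf{j}} + a\hat{\mathbf{k}}$ and $\mathbf{B} = a\hat{\mathbf{i}} + 2a\hat{\mathbf{j}} + 2\hat{\mathbf{k}}$ perpendicular?

[2] Show that the triple scalar product $(\mathbf{A}\times\mathbf{B})\cdot\mathbf{C}$ can be written as

$$(\mathbf{A}\times\mathbf{B})\cdot\mathbf{C} = \begin{vmatrix} A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \\ C_1 & C_2 & C_3 \end{vmatrix}$$

Show also that the product is unaffected by interchange of the scalar and vector product operations or by change in the order of $\mathbf{A}, \mathbf{B}, \mathbf{C}$ as long as they are in cyclic order, that is

$$(\mathbf{A}\times\mathbf{B})\cdot\mathbf{C} = \mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = \mathbf{B}\cdot(\mathbf{C}\times\mathbf{A}) = (\mathbf{C}\times\mathbf{A})\cdot\mathbf{B}$$

Therefore we may use the notation $\mathbf{ABC}$ to denote the triple scalar product. Finally give a geometric interpretation of $\mathbf{ABC}$ by computing the volume of the parallelepiped defined by the three vectors $\mathbf{A}, \mathbf{B}, \mathbf{C}$.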


Appendix  C 

Orthogonal  coordinate  systems 


The methods of vector analysis provide a convenient representation of physical laws. However, the manipulation of scalar and vector fields is greatly facilitated by use of components with respect to an orthogonal coordinate system.

C.1 Cartesian coordinates (x, y, z)

Cartesian (rectangular) coordinates provide the simplest orthogonal coordinate system. The unit vectors specifying the direction along the three orthogonal axes are taken to be $(\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}})$. In cartesian coordinates scalar and vector functions are written as

$$\phi = \phi(x, y, z) \tag{C.1}$$

$$\mathbf{r} = x\hat{\mathbf{i}} + y\hat{\mathbf{j}} + z\hat{\mathbf{k}} \tag{C.2}$$

Calculation of the time derivatives of the position vector is especially simple using cartesian coordinates because the unit vectors $(\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}})$ are constant and independent of time. That is,

$$\frac{d\hat{\mathbf{i}}}{dt} = \frac{d\hat{\mathbf{j}}}{dt} = \frac{d\hat{\mathbf{k}}}{dt} = 0$$

Since the time derivatives of the unit vectors are all zero, the velocity $\dot{\mathbf{r}} = \frac{d\mathbf{r}}{dt}$ reduces to the time derivatives of $x$, $y$, and $z$. That is,

$$\dot{\mathbf{r}} = \dot{x}\hat{\mathbf{i}} + \dot{y}\hat{\mathbf{j}} + \dot{z}\hat{\mathbf{k}} \tag{C.3}$$

Similarly the acceleration is given by

$$\ddot{\mathbf{r}} = \ddot{x}\hat{\mathbf{i}} + \ddot{y}\hat{\mathbf{j}} + \ddot{z}\hat{\mathbf{k}} \tag{C.4}$$

C.2 Curvilinear coordinate systems

There are many examples in physics where the symmetry of the problem makes it more convenient to solve motion at a point $P(x, y, z)$ using non-cartesian curvilinear coordinate systems. For example, problems having spherical symmetry are most conveniently handled using a spherical coordinate system $(r, \theta, \phi)$ with the origin at the center of spherical symmetry. Such problems occur frequently in electrostatics and gravitation; e.g. solutions of the atom, or planetary systems. Note that a cartesian coordinate system still is required to define the origin plus the polar and azimuthal angles $\theta, \phi$. Using spherical coordinates for a spherically symmetric system allows the problem to be factored into a cyclic angular part, the solution of which involves spherical harmonics that are common to all such spherically-symmetric problems, plus a one-dimensional radial part that contains the specifics of the particular spherically-symmetric potential. Similarly, for problems involving cylindrical symmetry, it is much more convenient to use a cylindrical coordinate system $(\rho, \phi, z)$. Again it is necessary to use a cartesian coordinate system to define the origin and angle $\phi$. Motion in a plane can be handled using two-dimensional polar coordinates.

Curvilinear coordinate systems introduce a complication in that the unit vectors are time dependent, in contrast to the cartesian coordinate system where the unit vectors $(\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}})$ are independent and constant in time. The introduction of this time dependence warrants further discussion.

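Each of the three axes $q_i$ in curvilinear coordinate systems can be expressed in cartesian coordinates $(x, y, z)$ as surfaces of constant $q_i$ given by the function

$$q_i = f_i(x, y, z) \tag{C.5}$$

where $i = 1, 2,$ or $3$. An element of length $ds_i$ perpendicular to the surface $q_i$ is the distance between the surfaces $q_i$ and $q_i + dq_i$ which can be expressed as

$$ds_i = h_i\,dq_i \tag{C.6}$$

where $h_i$ is a function of $(q_1, q_2, q_3)$. In cartesian coordinates $h_1$, $h_2$, and $h_3$ are all unity. The unit-length vectors $\hat{\mathbf{q}}_1, \hat{\mathbf{q}}_2, \hat{\mathbf{q}}_3$ are perpendicular to the respective $q_1, q_2, q_3$ surfaces, and are oriented to have increasing indices such that $\hat{\mathbf{q}}_1\times\hat{\mathbf{q}}_2 = \hat{\mathbf{q}}_3$. The correspondence of the curvilinear coordinates, unit vectors, and transform coefficients to cartesian, polar, cylindrical and spherical coordinates is given in table C.1.

$$\begin{array}{l|ccc|ccc|ccc}
\text{Curvilinear} & q_1 & q_2 & q_3 & \hat{\mathbf{q}}_1 & \hat{\mathbf{q}}_2 & \hat{\mathbf{q}}_3 & h_1 & h_2 & h_3 \\ \hline
\text{Cartesian} & x & y & z & \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} & 1 & 1 & 1 \\
\text{Polar} & r & \theta & & \hat{\mathbf{r}} & \hat{\boldsymbol{\theta}} & & 1 & r & \\
\text{Cylindrical} & \rho & \phi & z & \hat{\boldsymbol{\rho}} & \hat{\boldsymbol{\phi}} & \hat{\mathbf{z}} & 1 & \rho & 1 \\
\text{Spherical} & r & \theta & \phi & \hat{\mathbf{r}} & \hat{\boldsymbol{\theta}} & \hat{\boldsymbol{\phi}} & 1 & r & r\sin\theta
\end{array}$$

Table C.1: Curvilinear coordinates

The differential distance and volume elements are given by

$$d\mathbf{s} = ds_1\hat{\mathbf{q}}_1 + ds_2\hat{\mathbf{q}}_2 + ds_3\hat{\mathbf{q}}_3 = h_1dq_1\hat{\mathbf{q}}_1 + h_2dq_2\hat{\mathbf{q}}_2 + h_3dq_3\hat{\mathbf{q}}_3 \tag{C.7}$$

$$d\tau = ds_1ds_2ds_3 = h_1h_2h_3\,(dq_1dq_2dq_3) \tag{C.8}$$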

These  are  evaluated  below  for  polar,  cylindrical,  and  spherical  coordinates. 

C.2.1 Two-dimensional polar coordinates (r, θ)

The complication and implications of time-dependent unit vectors are best illustrated by considering two-dimensional polar coordinates, which are the simplest curvilinear coordinate system. Polar coordinates are a special case of cylindrical coordinates, when $z$ is held fixed, or a special case of the spherical coordinate system, when $\phi$ is held fixed.

Consider the motion of a point $P$ as it moves along a curve $s(t)$ such that in the time interval $dt$ it moves from $P(t)$ to $P(t + dt)$ as shown in the diagram of table C.2. The two-dimensional polar coordinates have unit vectors $\hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}$, which are orthogonal and change from $\hat{\mathbf{r}}_1, \hat{\boldsymbol{\theta}}_1$ to $\hat{\mathbf{r}}_2, \hat{\boldsymbol{\theta}}_2$ in the time $dt$. Note that for these polar coordinates the angle unit vector $\hat{\boldsymbol{\theta}}$ is taken to be tangential to the rotation since this is the direction of motion of a point on the circumference at radius $r$.

The net changes shown in the diagram of table C.2 are

$$d\hat{\mathbf{r}} = \hat{\mathbf{r}}_2 - \hat{\mathbf{r}}_1 = |\hat{\mathbf{r}}|\,d\theta\,\hat{\boldsymbol{\theta}} = d\theta\,\hat{\boldsymbol{\theta}} \tag{C.9}$$

since the unit vector $\hat{\mathbf{r}}$ has the constant magnitude $|\hat{\mathbf{r}}| = 1$. Note that the infinitesimal $d\hat{\mathbf{r}}$ is perpendicular to the unit vector $\hat{\mathbf{r}}$, that is, $d\hat{\mathbf{r}}$ points in the tangential direction $\hat{\boldsymbol{\theta}}$.

Similarly, the infinitesimal

$$d\hat{\boldsymbol{\theta}} = \hat{\boldsymbol{\theta}}_2 - \hat{\boldsymbol{\theta}}_1 = -d\theta\,\hat{\mathbf{r}} \tag{C.10}$$

which is perpendicular to the tangential $\hat{\boldsymbol{\theta}}$ unit vector and therefore points in the direction $-\hat{\mathbf{r}}$. The minus sign causes $-d\theta\,\hat{\mathbf{r}}$ to be directed in the opposite direction to $\hat{\mathbf{r}}$.



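The net distance element $d\mathbf{s}$ is given by

$$d\mathbf{s} = dr\,\hat{\mathbf{r}} + r\,d\hat{\mathbf{r}} = dr\,\hat{\mathbf{r}} + r\,d\theta\,\hat{\boldsymbol{\theta}} \tag{C.11}$$

This agrees with the prediction obtained using table C.1.

The time derivatives of the unit vectors are given by equations (C.9) and (C.10) to be

$$\frac{d\hat{\mathbf{r}}}{dt} = \frac{d\theta}{dt}\,\hat{\boldsymbol{\theta}} \tag{C.12}$$

$$\frac{d\hat{\boldsymbol{\theta}}}{dt} = -\frac{d\theta}{dt}\,\hat{\mathbf{r}} \tag{C.13}$$

Note that the time derivatives of unit vectors are perpendicular to the corresponding unit vector, and the unit vectors are coupled.

Consider that the velocity $\mathbf{v}$ is expressed as

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{dr}{dt}\hat{\mathbf{r}} + r\frac{d\hat{\mathbf{r}}}{dt} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}} \tag{C.14}$$

The velocity is resolved into a radial component $\dot{r}$ and an angular, transverse, component $r\dot{\theta}$.

Similarly the acceleration is given by

$$\mathbf{a} = \frac{d\mathbf{v}}{dt} = \frac{d\dot{r}}{dt}\hat{\mathbf{r}} + \dot{r}\frac{d\hat{\mathbf{r}}}{dt} + \frac{dr}{dt}\dot{\theta}\,\hat{\boldsymbol{\theta}} + r\frac{d\dot{\theta}}{dt}\hat{\boldsymbol{\theta}} + r\dot{\theta}\frac{d\hat{\boldsymbol{\theta}}}{dt} = \left(\ddot{r} - r\dot{\theta}^2\right)\hat{\mathbf{r}} + \left(r\ddot{\theta} + 2\dot{r}\dot{\theta}\right)\hat{\boldsymbol{\theta}} \tag{C.15}$$

where the $r\dot{\theta}^2\,\hat{\mathbf{r}}$ term is the effective centripetal acceleration while the $2\dot{r}\dot{\theta}\,\hat{\boldsymbol{\theta}}$ term is called the Coriolis term. For the case when $\dot{r} = \ddot{r} = 0$, the first bracket in C.15 is the centripetal acceleration while the second bracket is the tangential acceleration.

This discussion has shown that in contrast to the time independence of the cartesian unit basis vectors, the unit basis vectors for curvilinear coordinates are time dependent, which leads to components of the velocity and acceleration involving coupled coordinates.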
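The polar expression (C.15) can be tested against a direct numerical second derivative of the cartesian coordinates. The sketch below does this in Python for a hypothetical trajectory $r(t) = 1 + t/2$, $\theta(t) = t^2$; both the trajectory and the finite-difference step are assumptions chosen for illustration only.

```python
import math

def r(t):
    return 1.0 + 0.5 * t          # hypothetical radial motion

def theta(t):
    return t * t                  # hypothetical angular motion

def cartesian(t):
    """Cartesian position (x, y) of the point at time t."""
    return (r(t) * math.cos(theta(t)), r(t) * math.sin(theta(t)))

def acceleration_polar_formula(t):
    """Cartesian components of the acceleration built from equation (C.15)."""
    r_dot, r_ddot = 0.5, 0.0              # analytic derivatives of r(t)
    th_dot, th_ddot = 2.0 * t, 2.0        # analytic derivatives of theta(t)
    a_r = r_ddot - r(t) * th_dot ** 2             # radial bracket
    a_th = r(t) * th_ddot + 2.0 * r_dot * th_dot  # transverse bracket
    c, s = math.cos(theta(t)), math.sin(theta(t))
    # r_hat = (c, s) and theta_hat = (-s, c) in the cartesian basis
    return (a_r * c - a_th * s, a_r * s + a_th * c)

def acceleration_finite_difference(t, h=1e-4):
    """Central second difference of the cartesian position."""
    x0, y0 = cartesian(t - h)
    x1, y1 = cartesian(t)
    x2, y2 = cartesian(t + h)
    return ((x0 - 2.0 * x1 + x2) / h ** 2, (y0 - 2.0 * y1 + y2) / h ** 2)

print(acceleration_polar_formula(1.0), acceleration_finite_difference(1.0))
```

The two results agree to within the truncation and rounding error of the finite difference, which is what one expects if the Coriolis and centripetal terms are both included.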


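$$\begin{array}{l|l}
\text{Coordinates} & r, \theta \\ \hline
\text{Distance element} & d\mathbf{s} = dr\,\hat{\mathbf{r}} + r\,d\theta\,\hat{\boldsymbol{\theta}} \\
\text{Area element} & da = r\,dr\,d\theta \\
\text{Unit vectors} & \hat{\mathbf{r}} = \hat{\mathbf{i}}\cos\theta + \hat{\mathbf{j}}\sin\theta \\
 & \hat{\boldsymbol{\theta}} = -\hat{\mathbf{i}}\sin\theta + \hat{\mathbf{j}}\cos\theta \\
\text{Time derivatives of unit vectors} & \frac{d\hat{\mathbf{r}}}{dt} = \dot{\theta}\,\hat{\boldsymbol{\theta}} \\
 & \frac{d\hat{\boldsymbol{\theta}}}{dt} = -\dot{\theta}\,\hat{\mathbf{r}} \\
\text{Velocity} & \mathbf{v} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}} \\
\text{Kinetic energy} & \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right) \\
\text{Acceleration} & \mathbf{a} = \left(\ddot{r} - r\dot{\theta}^2\right)\hat{\mathbf{r}} + \left(r\ddot{\theta} + 2\dot{r}\dot{\theta}\right)\hat{\boldsymbol{\theta}}
\end{array}$$

Table C.2: Differential relations plus a diagram of the unit vectors for 2-dimensional polar coordinates.

C.2.2 Cylindrical Coordinates (ρ, φ, z)

The three-dimensional cylindrical coordinates $(\rho, \phi, z)$ are obtained by adding the motion along the symmetry axis $z$ to the case for polar coordinates. The unit basis vectors are shown in Table C.3 where the angular unit vector $\hat{\boldsymbol{\phi}}$ is taken to be tangential, corresponding to the direction a point on the circumference would move. The distance and volume elements, the cartesian coordinate components of the cylindrical unit basis vectors, and the unit vector time derivatives are shown in Table C.3. The time dependence of the unit vectors is used to derive the acceleration. As for the two-dimensional polar coordinates, the $\hat{\boldsymbol{\rho}}$ and $\hat{\boldsymbol{\phi}}$ direction components of the acceleration for cylindrical coordinates are coupled functions of $\rho, \dot{\rho}, \ddot{\rho}, \dot{\phi}$, and $\ddot{\phi}$.

$$\begin{array}{l|l}
\text{Coordinates} & \rho, \phi, z \\ \hline
\text{Distance element} & d\mathbf{s} = d\rho\,\hat{\boldsymbol{\rho}} + \rho\,d\phi\,\hat{\boldsymbol{\phi}} + dz\,\hat{\mathbf{z}} \\
\text{Volume element} & dv = \rho\,d\rho\,d\phi\,dz \\
\text{Unit vectors} & \hat{\boldsymbol{\rho}} = \hat{\mathbf{i}}\cos\phi + \hat{\mathbf{j}}\sin\phi \\
 & \hat{\boldsymbol{\phi}} = -\hat{\mathbf{i}}\sin\phi + \hat{\mathbf{j}}\cos\phi \\
 & \hat{\mathbf{z}} = \hat{\mathbf{k}} \\
\text{Time derivatives of unit vectors} & \frac{d\hat{\boldsymbol{\rho}}}{dt} = \dot{\phi}\,\hat{\boldsymbol{\phi}} \\
 & \frac{d\hat{\boldsymbol{\phi}}}{dt} = -\dot{\phi}\,\hat{\boldsymbol{\rho}} \\
 & \frac{d\hat{\mathbf{z}}}{dt} = 0 \\
\text{Velocity} & \mathbf{v} = \dot{\rho}\,\hat{\boldsymbol{\rho}} + \rho\dot{\phi}\,\hat{\boldsymbol{\phi}} + \dot{z}\,\hat{\mathbf{z}} \\
\text{Kinetic energy} & \frac{m}{2}\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2\right) \\
\text{Acceleration} & \mathbf{a} = \left(\ddot{\rho} - \rho\dot{\phi}^2\right)\hat{\boldsymbol{\rho}} + \left(\rho\ddot{\phi} + 2\dot{\rho}\dot{\phi}\right)\hat{\boldsymbol{\phi}} + \ddot{z}\,\hat{\mathbf{z}}
\end{array}$$

Table C.3: Differential relations plus a diagram of the unit vectors for cylindrical coordinates.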

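C.2.3 Spherical Coordinates (r, θ, φ)

The three-dimensional spherical coordinates can be treated the same way as for cylindrical coordinates. The unit basis vectors are shown in Table C.4 where the angular unit vectors $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\phi}}$ are taken to be tangential, corresponding to the direction a point on the circumference moves for a positive rotation angle.

$$\begin{array}{l|l}
\text{Coordinates} & r, \theta, \phi \\ \hline
\text{Distance element} & d\mathbf{s} = dr\,\hat{\mathbf{r}} + r\,d\theta\,\hat{\boldsymbol{\theta}} + r\sin\theta\,d\phi\,\hat{\boldsymbol{\phi}} \\
\text{Volume element} & dv = r^2\sin\theta\,dr\,d\theta\,d\phi \\
\text{Unit vectors} & \hat{\mathbf{r}} = \hat{\mathbf{i}}\sin\theta\cos\phi + \hat{\mathbf{j}}\sin\theta\sin\phi + \hat{\mathbf{k}}\cos\theta \\
 & \hat{\boldsymbol{\theta}} = \hat{\mathbf{i}}\cos\theta\cos\phi + \hat{\mathbf{j}}\cos\theta\sin\phi - \hat{\mathbf{k}}\sin\theta \\
 & \hat{\boldsymbol{\phi}} = -\hat{\mathbf{i}}\sin\phi + \hat{\mathbf{j}}\cos\phi \\
\text{Time derivatives of unit vectors} & \frac{d\hat{\mathbf{r}}}{dt} = \dot{\theta}\,\hat{\boldsymbol{\theta}} + \dot{\phi}\sin\theta\,\hat{\boldsymbol{\phi}} \\
 & \frac{d\hat{\boldsymbol{\theta}}}{dt} = -\dot{\theta}\,\hat{\mathbf{r}} + \dot{\phi}\cos\theta\,\hat{\boldsymbol{\phi}} \\
 & \frac{d\hat{\boldsymbol{\phi}}}{dt} = -\dot{\phi}\sin\theta\,\hat{\mathbf{r}} - \dot{\phi}\cos\theta\,\hat{\boldsymbol{\theta}} \\
\text{Velocity} & \mathbf{v} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}} + r\dot{\phi}\sin\theta\,\hat{\boldsymbol{\phi}} \\
\text{Kinetic energy} & \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2\right) \\
\text{Acceleration} & \mathbf{a} = \left(\ddot{r} - r\dot{\theta}^2 - r\dot{\phi}^2\sin^2\theta\right)\hat{\mathbf{r}} \\
 & \quad + \left(r\ddot{\theta} + 2\dot{r}\dot{\theta} - r\dot{\phi}^2\sin\theta\cos\theta\right)\hat{\boldsymbol{\theta}} \\
 & \quad + \left(r\ddot{\phi}\sin\theta + 2\dot{r}\dot{\phi}\sin\theta + 2r\dot{\theta}\dot{\phi}\cos\theta\right)\hat{\boldsymbol{\phi}}
\end{array}$$

Table C.4: Differential relations plus a diagram of the unit vectors for spherical coordinates.

The distance and volume elements, the cartesian coordinate components of the spherical unit basis vectors, and the unit vector time derivatives are shown in Table C.4. The time dependence of the unit vectors is used to derive the acceleration. As for the case of cylindrical coordinates, the $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$, and $\hat{\boldsymbol{\phi}}$ components of the acceleration involve coupling of the coordinates and their time derivatives.

It is important to note that the angular unit vectors $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\phi}}$ are taken to be tangential to the circles of rotation. However, for discussion of angular velocity or angular momentum it is more convenient to use the axes of rotation defined by $\hat{\mathbf{r}}\times\hat{\boldsymbol{\theta}}$ and $\hat{\mathbf{r}}\times\hat{\boldsymbol{\phi}}$ for specifying the vector properties, which are perpendicular to the unit vectors $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\phi}}$. Be careful not to confuse the unit vectors $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\phi}}$ with those used for the angular velocities $\dot{\theta}$ and $\dot{\phi}$.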
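The cartesian components of the spherical unit vectors can be checked for orthonormality and handedness with a few lines of Python. This is only an illustrative sketch; the function names and the sample angles are assumptions of the example.

```python
import math

def dot(u, v):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def spherical_unit_vectors(theta, phi):
    """Cartesian components of (r_hat, theta_hat, phi_hat)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    r_hat = (st * cp, st * sp, ct)
    theta_hat = (ct * cp, ct * sp, -st)
    phi_hat = (-sp, cp, 0.0)
    return r_hat, theta_hat, phi_hat

r_hat, theta_hat, phi_hat = spherical_unit_vectors(0.7, 1.9)  # sample angles

# Orthonormality: each vector has unit length, distinct vectors are orthogonal.
lengths = [dot(v, v) for v in (r_hat, theta_hat, phi_hat)]
overlaps = [dot(r_hat, theta_hat), dot(theta_hat, phi_hat), dot(phi_hat, r_hat)]

# Right-handedness: r_hat x theta_hat should reproduce phi_hat.
r_cross_theta = cross(r_hat, theta_hat)
print(lengths, overlaps, r_cross_theta, phi_hat)
```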


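C.3 Frenet-Serret coordinates

The cartesian, polar, cylindrical, and spherical coordinate systems are all orthogonal coordinate systems that are fixed in space. There are situations where it is more convenient to use the Frenet-Serret coordinates, which comprise an orthogonal coordinate system that is fixed to the particle that is moving along a continuous, differentiable trajectory in three-dimensional Euclidean space. Let $s(t)$ represent a monotonically increasing arc-length along the trajectory of the particle motion as a function of time $t$. The Frenet-Serret coordinates, shown in Table C.5, are the three instantaneous orthogonal unit vectors $\hat{\mathbf{t}}$, $\hat{\mathbf{n}}$, and $\hat{\mathbf{b}}$, where the tangent unit vector $\hat{\mathbf{t}}$ is the instantaneous tangent to the curve, the normal unit vector $\hat{\mathbf{n}}$ is in the plane of curvature of the trajectory, pointing towards the center of the instantaneous radius of curvature, and is perpendicular to the tangent unit vector $\hat{\mathbf{t}}$, while the binormal unit vector $\hat{\mathbf{b}} = \hat{\mathbf{t}}\times\hat{\mathbf{n}}$ is perpendicular to the plane of curvature and is mutually perpendicular to the other two Frenet-Serret unit vectors. The Frenet-Serret unit vectors are defined by the relations

$$\frac{d\hat{\mathbf{t}}}{ds} = \kappa\,\hat{\mathbf{n}} \tag{C.16}$$

$$\frac{d\hat{\mathbf{b}}}{ds} = -\tau\,\hat{\mathbf{n}} \tag{C.17}$$

$$\frac{d\hat{\mathbf{n}}}{ds} = -\kappa\,\hat{\mathbf{t}} + \tau\,\hat{\mathbf{b}} \tag{C.18}$$

The curvature $\kappa = \frac{1}{\rho}$ where $\rho$ is the radius of curvature, and $\tau$ is the torsion that can be either positive or negative. For increasing $s$, a non-zero curvature $\kappa$ implies that the triad of unit vectors rotates in a right-handed sense about $\hat{\mathbf{b}}$. If the torsion $\tau$ is positive (negative) the triad of unit vectors rotates in a right (left) handed sense about $\hat{\mathbf{t}}$.

$$\begin{array}{l|l}
\text{Distance element} & ds(t) = \left|\frac{d\mathbf{r}}{dt}\right| dt = |\mathbf{v}(t)|\,dt \\
\text{Unit vectors} & \hat{\mathbf{t}}(t) = \frac{\mathbf{v}(t)}{|\mathbf{v}(t)|} \\
 & \hat{\mathbf{n}}(t) = \frac{d\hat{\mathbf{t}}/dt}{\left|d\hat{\mathbf{t}}/dt\right|} \\
 & \hat{\mathbf{b}}(t) = \hat{\mathbf{t}}\times\hat{\mathbf{n}} \\
\text{Time derivatives of unit vectors} & \frac{d}{dt}\begin{pmatrix} \hat{\mathbf{t}} \\ \hat{\mathbf{n}} \\ \hat{\mathbf{b}} \end{pmatrix} = |\mathbf{v}| \begin{pmatrix} 0 & \kappa & 0 \\ -\kappa & 0 & \tau \\ 0 & -\tau & 0 \end{pmatrix} \begin{pmatrix} \hat{\mathbf{t}} \\ \hat{\mathbf{n}} \\ \hat{\mathbf{b}} \end{pmatrix} \\
\text{Velocity} & \mathbf{v}(t) = \frac{d\mathbf{r}(t)}{dt} \\
\text{Acceleration} & \mathbf{a}(t) = \frac{dv}{dt}\,\hat{\mathbf{t}} + \kappa v^2\,\hat{\mathbf{n}}
\end{array}$$

Table C.5: The differential relations plus a diagram of the corresponding unit vectors for the Frenet-Serret coordinate system.

The above equations also can be rewritten using a new rotation vector $\boldsymbol{\omega}$ where

$$\boldsymbol{\omega} = \tau\,\hat{\mathbf{t}} + \kappa\,\hat{\mathbf{b}} \tag{C.19}$$

Then equations C.16 - C.18 are transformed to

$$\frac{d\hat{\mathbf{t}}}{ds} = \boldsymbol{\omega}\times\hat{\mathbf{t}} \tag{C.20}$$

$$\frac{d\hat{\mathbf{n}}}{ds} = \boldsymbol{\omega}\times\hat{\mathbf{n}} \tag{C.21}$$

$$\frac{d\hat{\mathbf{b}}}{ds} = \boldsymbol{\omega}\times\hat{\mathbf{b}} \tag{C.22}$$

In general the Frenet-Serret unit vectors are time dependent. If the curvature $\kappa = 0$ then the curve is a straight line and $\hat{\mathbf{n}}$ and $\hat{\mathbf{b}}$ are not well defined. If the torsion is zero then the trajectory lies in a plane. Note that a helix has constant curvature and constant torsion.

The rate of change of a general vector field $\mathbf{E}$ along the trajectory can be written as

$$\frac{d\mathbf{E}}{ds} = \left(\frac{dE_t}{ds}\hat{\mathbf{t}} + \frac{dE_n}{ds}\hat{\mathbf{n}} + \frac{dE_b}{ds}\hat{\mathbf{b}}\right) + \boldsymbol{\omega}\times\mathbf{E} \tag{C.23}$$

The Frenet-Serret coordinates are used in the life sciences to describe the motion of a moving organism in a viscous medium. The Frenet-Serret coordinates also have applications to General Relativity.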
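The remark above that a helix has constant curvature and constant torsion can be illustrated numerically. The Python sketch below assumes the standard parametric expressions $\kappa = |\mathbf{r}'\times\mathbf{r}''|/|\mathbf{r}'|^3$ and $\tau = (\mathbf{r}'\times\mathbf{r}'')\cdot\mathbf{r}'''/|\mathbf{r}'\times\mathbf{r}''|^2$, which are not derived in this appendix, and a hypothetical helix $\mathbf{r}(t) = (R\cos t, R\sin t, ct)$.

```python
import math

def dot(u, v):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def helix_derivatives(t, radius, pitch):
    """First three parameter derivatives of r(t) = (R cos t, R sin t, c t)."""
    s, c = math.sin(t), math.cos(t)
    d1 = (-radius * s, radius * c, pitch)
    d2 = (-radius * c, -radius * s, 0.0)
    d3 = (radius * s, -radius * c, 0.0)
    return d1, d2, d3

def curvature_and_torsion(d1, d2, d3):
    """Curvature and torsion from the parameter derivatives of the curve."""
    n = cross(d1, d2)
    kappa = math.sqrt(dot(n, n)) / math.sqrt(dot(d1, d1)) ** 3
    tau = dot(n, d3) / dot(n, n)
    return kappa, tau

R, c = 2.0, 1.0                        # hypothetical helix radius and pitch
for t in (0.3, 1.7):
    print(curvature_and_torsion(*helix_derivatives(t, R, c)))
```

For this helix both quantities are independent of the parameter $t$, with $\kappa = R/(R^2 + c^2)$ and $\tau = c/(R^2 + c^2)$.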


Workshop  exercises 

1. The goal of this problem is to help you understand the origin of the equations that relate two different coordinate systems. Refer to diagrams for cylindrical and spherical coordinates as your teaching assistant explains how to arrive at expressions for $x_1$, $x_2$, and $x_3$ in terms of $\rho$, $\phi$, and $z$ and how to derive expressions for the velocity and acceleration vectors in cylindrical coordinates. Now try to relate spherical and rectangular coordinate systems. Your group should derive expressions relating the coordinates of the two systems, expressions relating the unit vectors and their time derivatives of the two systems, and finally, expressions for the velocity and acceleration in spherical coordinates.


Appendix  D 

Coordinate  transformations 


Coordinate  systems  can  be  translated,  or  rotated  with  respect  to  each  other  as  well  as  being  subject  to  spatial 
inversion  or  time  reversal.  Scalars,  vectors,  and  tensors  are  defined  by  their  transformation  properties  under 
rotation,  spatial  inversion  and  time  reversal,  and  thus  such  transformations  play  a pivotal  role  in  physics. 


D.1 Translational transformations

Translational transformations are involved frequently for transforming between the center of mass and laboratory frames for reaction kinematics as well as when performing vector addition of central forces for the cases where the centers are displaced. Both the classical Galilean transformation and the relativistic Lorentz transformation are handled the same way. Consider two parallel orthonormal coordinate frames where the origin of $F'(x', y', z')$ is displaced by a time dependent vector $\mathbf{a}(t)$ from the origin of frame $F(x, y, z)$. Then the Galilean transformation for a vector $\mathbf{r}$ in frame $F$ to $\mathbf{r}'$ in frame $F'$ is given by

$$\mathbf{r}'(x', y', z') = \mathbf{r}(x, y, z) + \mathbf{a}(t) \tag{D.1}$$

The velocities for a moving frame are given by the vector difference of the velocity in a stationary frame, and the velocity of the origin of the moving frame. Linear accelerations can be handled similarly.


D.2  Rotational  transformations 

D.2.1  Rotation  matrix 

Rotational  transformations  of  the  coordinate  system  are  used  extensively  in  physics.  The  transformation 
properties  of  fields  under  rotation  define  the  scalar  and  vector  properties  of  fields,  as  well  as  rotational 
symmetry  and  conservation  of  angular  momentum. 

Rotation  of  the  coordinate  frame  does  not  change  the  value  of  any  scalar  observable  such  as  mass, 
temperature  etc.  That  is,  transformation  of  a scalar  quantity  is  invariant  under  coordinate  rotation  from 
x,y,z  ->  x',y',z'. 

<t>{x'y'z')  = (j>(xyz)  (D.2) 

By  contrast,  the  components  of  a vector  along  the  coordinate  axes  change  under  rotation  of  the  coordinate 
axes.  This  difference  in  transformation  properties  under  rotation  between  a scalar  and  a vector  is  important 
and  defines  both  scalars  and  a vectors. 

Matrix  mechanics,  described  in  appendix  A,  provides  the  most  convenient  way  to  handle  coordinate 
rotations.  The  transformation  matrix,  between  coordinate  systems  having  differing  orientations  is  called  the 
rotation  matrix.  This  transforms  the  components  of  any  vector  with  respect  to  one  coordinate  frame  to 
the  components  with  respect  to  a second  coordinate  frame  rotated  with  respect  to  the  first  frame. 

Assume a point P has coordinates $(x_1, x_2, x_3)$ with respect to a certain coordinate system. Consider
rotation to another coordinate frame for which the point P has coordinates $(x'_1, x'_2, x'_3)$ and assume that the
origins of both frames coincide. Rotation of a frame does not change the vector, only the components of the
vector with respect to the unit basis vectors. Therefore

$$\mathbf{x} = \hat{\mathbf{e}}'_1 x'_1 + \hat{\mathbf{e}}'_2 x'_2 + \hat{\mathbf{e}}'_3 x'_3 = \hat{\mathbf{e}}_1 x_1 + \hat{\mathbf{e}}_2 x_2 + \hat{\mathbf{e}}_3 x_3 \qquad (D.3)$$

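Note that if one designates that the unit vectors for the unprimed coordinate frame are
$(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ and for the primed coordinate frame
$(\hat{\mathbf{e}}'_1, \hat{\mathbf{e}}'_2, \hat{\mathbf{e}}'_3)$, then taking the scalar product of equation D.3 sequentially with
each of the unit base vectors $(\hat{\mathbf{e}}'_1, \hat{\mathbf{e}}'_2, \hat{\mathbf{e}}'_3)$ leads to the following three relations

$$\begin{aligned}
x'_1 &= (\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_1) x_1 + (\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_2) x_2 + (\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_3) x_3 \\
x'_2 &= (\hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_1) x_1 + (\hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_2) x_2 + (\hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_3) x_3 \\
x'_3 &= (\hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_1) x_1 + (\hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_2) x_2 + (\hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_3) x_3
\end{aligned} \qquad (D.4)$$

Note that the $(\hat{\mathbf{e}}'_i \cdot \hat{\mathbf{e}}_j)$ are the direction cosines as defined by the scalar product of two unit vectors for axes
$i,j$, that is, they are the cosine of the angle between the two unit vectors.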

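Equation D.4 can be written in matrix form as

$$\mathbf{x}' = \boldsymbol{\lambda} \cdot \mathbf{x} \qquad (D.5)$$

where the "$\cdot$" means the inner matrix product of the rotation matrix $\boldsymbol{\lambda}$ and the vector $\mathbf{x}$, where

$$\mathbf{x}' = \begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} \qquad
\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \qquad
\boldsymbol{\lambda} = \begin{pmatrix}
\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_3 \\
\hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_3 \\
\hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_3
\end{pmatrix} \qquad (D.6)$$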

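The inverse procedure is obtained by multiplying equation D.3 successively by one of the unit basis
vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ leading to three equations

$$\begin{aligned}
x_1 &= (\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}'_1) x'_1 + (\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}'_2) x'_2 + (\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}'_3) x'_3 \\
x_2 &= (\hat{\mathbf{e}}_2 \cdot \hat{\mathbf{e}}'_1) x'_1 + (\hat{\mathbf{e}}_2 \cdot \hat{\mathbf{e}}'_2) x'_2 + (\hat{\mathbf{e}}_2 \cdot \hat{\mathbf{e}}'_3) x'_3 \\
x_3 &= (\hat{\mathbf{e}}_3 \cdot \hat{\mathbf{e}}'_1) x'_1 + (\hat{\mathbf{e}}_3 \cdot \hat{\mathbf{e}}'_2) x'_2 + (\hat{\mathbf{e}}_3 \cdot \hat{\mathbf{e}}'_3) x'_3
\end{aligned} \qquad (D.7)$$

Equation D.7 can be written in matrix form as

$$\mathbf{x} = \boldsymbol{\lambda}^T \cdot \mathbf{x}' \qquad (D.8)$$

where $\boldsymbol{\lambda}^T$ is the transpose of $\boldsymbol{\lambda}$.

Note that substituting equation D.5 into equation D.8 gives

$$\mathbf{x} = \boldsymbol{\lambda}^T \cdot (\boldsymbol{\lambda} \cdot \mathbf{x}) = (\boldsymbol{\lambda}^T \cdot \boldsymbol{\lambda}) \cdot \mathbf{x}$$

which requires that

$$(\boldsymbol{\lambda}^T \cdot \boldsymbol{\lambda}) = \mathbb{I} \qquad (D.9)$$

where $\mathbb{I}$ is the identity matrix. This implies that the rotation matrix $\boldsymbol{\lambda}$ is orthogonal with $\boldsymbol{\lambda}^T = \boldsymbol{\lambda}^{-1}$.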
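It is convenient to rename the elements of the rotation matrix to be

$$\lambda_{ij} \equiv (\hat{\mathbf{e}}'_i \cdot \hat{\mathbf{e}}_j)$$

so that the rotation matrix is written more compactly as

$$\boldsymbol{\lambda} = \begin{pmatrix}
\lambda_{11} & \lambda_{12} & \lambda_{13} \\
\lambda_{21} & \lambda_{22} & \lambda_{23} \\
\lambda_{31} & \lambda_{32} & \lambda_{33}
\end{pmatrix} \qquad (D.10)$$

and equation D.4 becomes

$$\begin{aligned}
x'_1 &= \lambda_{11} x_1 + \lambda_{12} x_2 + \lambda_{13} x_3 \\
x'_2 &= \lambda_{21} x_1 + \lambda_{22} x_2 + \lambda_{23} x_3 \\
x'_3 &= \lambda_{31} x_1 + \lambda_{32} x_2 + \lambda_{33} x_3
\end{aligned} \qquad (D.11)$$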
Consider an arbitrary rotation through an angle $\theta$. Equations (D.10) and (D.11) can be used to relate
six of the nine quantities $\lambda_{ij}$ in the rotation matrix, so only three of the quantities are independent. That
is, because of equation (D.11) we have three equations which ensure that the transformation is unitary.

$$\lambda_{i1}^2 + \lambda_{i2}^2 + \lambda_{i3}^2 = 1 \qquad (D.12)$$

Also requiring that the axes be orthogonal gives three equations

$$\sum_j \lambda_{ij} \lambda_{kj} = 0 \qquad i \neq k \qquad (D.13)$$

These six relations can be expressed as

$$\sum_j \lambda_{ij} \lambda_{kj} = \delta_{ik} \qquad (D.14)$$

The fact that the rotation matrix should have three independent quantities is due to the fact that all rotations
can be expressed in terms of rotations about three orthogonal axes.

D.1 Example: Rotation matrix

Consider a point $P(x_1, x_2, x_3) = P(3, 4, 5)$ in the unprimed coordinate system. Consider the same point
$P(x'_1, x'_2, x'_3)$ in the primed coordinate system which has been rotated by an angle 60° about the $x_1$ axis as
shown. The direction cosines $\lambda_{ij} = \cos(\theta_{ij})$ can be determined from the figure to be the following


  i    j    $\theta_{ij}$ (degrees)    $\lambda_{ij} = \cos\theta_{ij}$
  1    1    0                  1
  1    2    90                 0
  1    3    90                 0
  2    1    90                 0
  2    2    60                 0.500
  2    3    90 - 60            0.866
  3    1    90                 0
  3    2    90 + 60            -0.866
  3    3    60                 0.500

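Thus the rotation matrix is

$$\boldsymbol{\lambda} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0.500 & 0.866 \\ 0 & -0.866 & 0.500 \end{pmatrix}$$

The transformed point $P'(x'_1, x'_2, x'_3)$ therefore is given by

$$\begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} =
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0.500 & 0.866 \\ 0 & -0.866 & 0.500 \end{pmatrix}
\begin{pmatrix} 3 \\ 4 \\ 5 \end{pmatrix} =
\begin{pmatrix} 3 \\ 6.330 \\ -0.964 \end{pmatrix}$$

Note that the radial coordinate $r_P = r'_P = \sqrt{50}$. That is, the rotational transformation is unitary and thus
the magnitude of the vector is unchanged.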
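
The arithmetic of this example can be checked numerically. The following is a minimal sketch in plain Python; the helper names `rotation_about_x1`, `mat_vec` and `norm` are illustrative choices, not notation from the text.

```python
import math

def rotation_about_x1(theta_deg):
    """Direction-cosine matrix for axes rotated by theta_deg about the x1 axis."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0],
            [0.0, c, s],
            [0.0, -s, c]]

def mat_vec(m, v):
    """Product of a 3x3 matrix with a 3-component vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))

lam = rotation_about_x1(60.0)
p = [3.0, 4.0, 5.0]          # the point P in the unprimed frame
p_prime = mat_vec(lam, p)    # components of the same point in the primed frame

print([round(c, 3) for c in p_prime])
print(norm(p) ** 2, norm(p_prime) ** 2)   # both should be 50 to rounding error
```

The two printed squared lengths agree, which is the numerical statement that the rotation leaves the magnitude of the vector unchanged.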
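D.2 Example: Proof that a rotation matrix is orthogonal

Consider the rotation matrix

$$\boldsymbol{\lambda} = \frac{1}{9} \begin{pmatrix} 4 & 7 & -4 \\ 1 & 4 & 8 \\ 8 & -4 & 1 \end{pmatrix}$$

The product

$$\boldsymbol{\lambda}^T \boldsymbol{\lambda} = \frac{1}{81}
\begin{pmatrix} 4 & 1 & 8 \\ 7 & 4 & -4 \\ -4 & 8 & 1 \end{pmatrix}
\begin{pmatrix} 4 & 7 & -4 \\ 1 & 4 & 8 \\ 8 & -4 & 1 \end{pmatrix}
= \frac{1}{81} \begin{pmatrix} 81 & 0 & 0 \\ 0 & 81 & 0 \\ 0 & 0 & 81 \end{pmatrix} = \mathbb{I}$$

which implies that $\boldsymbol{\lambda}$ is orthogonal.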


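D.2.2 Finite rotations

Consider two finite 90° rotations $\boldsymbol{\lambda}_A$ and $\boldsymbol{\lambda}_B$ illustrated in figure D.1. The $\boldsymbol{\lambda}_A$ rotation is 90°
around the $x_3$ axis in a right-handed direction as shown. In such a rotation the axes transform to
$x'_1 = x_2$, $x'_2 = -x_1$, $x'_3 = x_3$ and the rotation matrix is

$$\boldsymbol{\lambda}_A = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad (D.15)$$

The second rotation $\boldsymbol{\lambda}_B$ is a right-handed rotation about the $x'_1$ axis which formerly was the $x_2$ axis.
Then $x''_1 = x'_1$, $x''_2 = x'_3$, $x''_3 = -x'_2$ and the rotation matrix is

$$\boldsymbol{\lambda}_B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} \qquad (D.16)$$

Figure D.1: Order of two finite rotations for a parallelepiped.

Consider the product of these two finite rotations which corresponds to a single rotation matrix $\boldsymbol{\lambda}_{AB}$

$$\boldsymbol{\lambda}_{AB} = \boldsymbol{\lambda}_B \boldsymbol{\lambda}_A \qquad (D.17)$$

That is:

$$\boldsymbol{\lambda}_{AB} =
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}
\begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \qquad (D.18)$$

Now consider that the order of these two rotations is reversed.

$$\boldsymbol{\lambda}_{BA} = \boldsymbol{\lambda}_A \boldsymbol{\lambda}_B \qquad (D.19)$$

That is:

$$\boldsymbol{\lambda}_{BA} =
\begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}
= \begin{pmatrix} 0 & 0 & 1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix} \neq \boldsymbol{\lambda}_{AB} \qquad (D.20)$$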


An entirely different orientation results as illustrated in figure D.1.

This behavior of finite rotations is a consequence of the fact that finite rotations do not commute, that
is, reversing the order does not give the same answer. Thus, if we associate the vectors $\mathbf{A}$ and $\mathbf{B}$ with
these rotations, then it implies that the product $\mathbf{AB} \neq \mathbf{BA}$. That is, finite rotations do not
behave like true vectors, since they do not commute.
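
The two matrix products can be reproduced directly. This is a small sketch in plain Python, assuming the two 90° rotation matrices of equations D.15 and D.16; the helper `mat_mul` is an illustrative name.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# 90 degree rotation about the x3 axis (equation D.15)
lam_A = [[0, 1, 0],
         [-1, 0, 0],
         [0, 0, 1]]

# 90 degree rotation about the new x1 axis (equation D.16)
lam_B = [[1, 0, 0],
         [0, 0, 1],
         [0, -1, 0]]

lam_AB = mat_mul(lam_B, lam_A)   # rotation A first, then rotation B
lam_BA = mat_mul(lam_A, lam_B)   # rotation B first, then rotation A

print(lam_AB)
print(lam_BA)
print(lam_AB == lam_BA)          # the two orders give different matrices
```

Because the entries are integers, the comparison is exact and shows the non-commutation without any rounding concerns.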
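D.2.3 Infinitesimal rotations

Infinitesimal rotations do not suffer from the noncommutation defect of finite rotations. If the position
vector of a point changes from $\mathbf{r}$ to $\mathbf{r} + \delta\mathbf{r}$ then the geometrical situation is represented correctly by

$$\delta\mathbf{r} = \delta\boldsymbol{\theta} \times \mathbf{r} \qquad (D.21)$$

where $\delta\boldsymbol{\theta}$ is a quantity whose magnitude is equal to the infinitesimal rotation angle and which has a
direction along the instantaneous axis of rotation as illustrated in figure D.2.

Figure D.2: Infinitesimal rotation

That the infinitesimal angle $\delta\boldsymbol{\theta}$ is a vector is shown by proving that two infinitesimal rotations
$\delta\boldsymbol{\theta}_1$ and $\delta\boldsymbol{\theta}_2$ commute. The changes in the position vector of the point are

$$\delta\mathbf{r}_1 = \delta\boldsymbol{\theta}_1 \times \mathbf{r} \qquad (D.22)$$

and

$$\delta\mathbf{r}_2 = \delta\boldsymbol{\theta}_2 \times (\mathbf{r} + \delta\mathbf{r}_1) \qquad (D.23)$$

Thus the final position vector for $\delta\boldsymbol{\theta}_1$ followed by $\delta\boldsymbol{\theta}_2$ is

$$\mathbf{r} + \delta\mathbf{r}_1 + \delta\mathbf{r}_2 = \mathbf{r} + \delta\boldsymbol{\theta}_1 \times \mathbf{r} + \delta\boldsymbol{\theta}_2 \times (\mathbf{r} + \delta\mathbf{r}_1) \qquad (D.24)$$

Assuming that the second-order infinitesimals can be ignored gives

$$\mathbf{r} + \delta\mathbf{r}_1 + \delta\mathbf{r}_2 = \mathbf{r} + \delta\boldsymbol{\theta}_1 \times \mathbf{r} + \delta\boldsymbol{\theta}_2 \times \mathbf{r} \qquad (D.25)$$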


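Consider now the inverse order of rotations.

$$\mathbf{r} + \delta\mathbf{r}_2 + \delta\mathbf{r}_1 = \mathbf{r} + \delta\boldsymbol{\theta}_2 \times \mathbf{r} + \delta\boldsymbol{\theta}_1 \times (\mathbf{r} + \delta\mathbf{r}_2) \qquad (D.26)$$

Again, neglecting the second-order infinitesimals gives

$$\mathbf{r} + \delta\mathbf{r}_2 + \delta\mathbf{r}_1 = \mathbf{r} + \delta\boldsymbol{\theta}_2 \times \mathbf{r} + \delta\boldsymbol{\theta}_1 \times \mathbf{r} \qquad (D.27)$$

Note that the products of these two infinitesimal rotations, D.25 and D.27, are identical. That is, assuming
that second-order infinitesimals can be neglected, then the infinitesimal rotations commute, and thus $\delta\boldsymbol{\theta}_1$
and $\delta\boldsymbol{\theta}_2$ are correctly represented by vectors.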
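
A numerical check makes the "second-order" statement concrete. The sketch below, in plain Python, applies the first-order update of equation D.21 twice in each order and measures how far apart the two results are; the starting vector, the rotation axes and the helper names are arbitrary illustrative choices.

```python
import math

def cross(a, b):
    """Vector (cross) product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rotate_small(r, dtheta):
    """First-order update r -> r + dtheta x r (equation D.21)."""
    dr = cross(dtheta, r)
    return [r[i] + dr[i] for i in range(3)]

def order_difference(eps):
    """Distance between the results of applying two small rotations in both orders."""
    r = [1.0, 2.0, 3.0]
    d1 = [eps, 0.0, 0.0]    # small rotation about the x1 axis
    d2 = [0.0, eps, 0.0]    # small rotation about the x2 axis
    r12 = rotate_small(rotate_small(r, d1), d2)
    r21 = rotate_small(rotate_small(r, d2), d1)
    return math.sqrt(sum((r12[i] - r21[i]) ** 2 for i in range(3)))

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, order_difference(eps))
```

Reducing the rotation angle by a factor of 10 reduces the discrepancy between the two orders by a factor of 100, so the non-commuting part is of second order and vanishes relative to the first-order terms in the infinitesimal limit.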

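The fact that $\delta\boldsymbol{\theta}$ is a vector allows angular velocity to be represented by a vector. That is, angular
velocity is the ratio of an infinitesimal rotation to an infinitesimal time.

$$\boldsymbol{\omega} = \frac{\delta\boldsymbol{\theta}}{\delta t} \qquad (D.28)$$

Note that this implies that the velocity of the point can be expressed as

$$\mathbf{v} = \frac{\delta\mathbf{r}}{\delta t} = \frac{\delta\boldsymbol{\theta}}{\delta t} \times \mathbf{r} = \boldsymbol{\omega} \times \mathbf{r} \qquad (D.29)$$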


D.2.4 Proper and improper rotations

The requirement that the coordinate axes be orthogonal, and that the transformation be unitary, leads to
the relation between the components of the rotation matrix.

$$\sum_j \lambda_{ij} \lambda_{kj} = \delta_{ik} \qquad (D.30)$$

It was shown in equation A.12 that, for such an orthogonal matrix, the inverse matrix $\boldsymbol{\lambda}^{-1}$ equals the
transposed matrix $\boldsymbol{\lambda}^T$

$$\boldsymbol{\lambda}^{-1} = \boldsymbol{\lambda}^T$$
Inserting the orthogonality relation for the rotation matrix, and using
$|\boldsymbol{\lambda}^T \boldsymbol{\lambda}| = |\boldsymbol{\lambda}^T| |\boldsymbol{\lambda}| = |\boldsymbol{\lambda}|^2$ together with $|\mathbb{I}| = 1$, leads to the fact that the square of the
determinant of the rotation matrix equals one,

$$|\boldsymbol{\lambda}|^2 = 1 \qquad (D.31)$$

that is

$$|\boldsymbol{\lambda}| = \pm 1 \qquad (D.32)$$

A proper rotation is the rotation of a normal vector and has

$$|\boldsymbol{\lambda}| = +1 \qquad (D.33)$$

An improper rotation corresponds to

$$|\boldsymbol{\lambda}| = -1 \qquad (D.34)$$

An improper rotation implies a rotation plus a spatial reflection which cannot be achieved by any combination
of only rotations.

Consider the cross product of two vectors $\mathbf{c} = \mathbf{a} \times \mathbf{b}$. It can be shown that the cross product behaves
under rotation as:

$$c'_i = |\boldsymbol{\lambda}| \sum_j \lambda_{ij} c_j \qquad (D.35)$$

For all proper rotations the determinant $|\boldsymbol{\lambda}| = +1$ and thus the cross product also acts like a proper vector
under rotation. This is not true for improper rotations where $|\boldsymbol{\lambda}| = -1$.
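
Equation D.35 can be tested numerically for one proper and one improper transformation. The sketch below is plain Python; the proper matrix is the 90° rotation of equation D.15, while the improper matrix (a reflection of the third axis) and the vectors `a` and `b` are hypothetical choices made only for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def mat_vec(m, v):
    """Product of a 3x3 matrix with a 3-component vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(a, b):
    """Vector (cross) product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

proper = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]     # 90 degree rotation about x3
improper = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]   # reflection of the x3 axis (illustrative)

a = [1, 2, 3]
b = [4, 5, 6]

def cross_rule(lam):
    """Return both sides of equation D.35 for the matrix lam."""
    lhs = cross(mat_vec(lam, a), mat_vec(lam, b))              # c' built from a', b'
    rhs = [det3(lam) * c for c in mat_vec(lam, cross(a, b))]   # |lam| * lam . c
    return lhs, rhs

for name, lam in (("proper", proper), ("improper", improper)):
    lhs, rhs = cross_rule(lam)
    print(name, det3(lam), lhs, rhs)
```

For the improper case the factor $|\boldsymbol{\lambda}| = -1$ is needed to make the two sides agree, which is the extra sign that distinguishes the cross product from a proper vector.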

D.3  Spatial  inversion  transformation 

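Spatial inversion, that is, mirror reflection, corresponds to reflection of all coordinate vectors,
$\hat{\mathbf{i}} \rightarrow -\hat{\mathbf{i}}$, $\hat{\mathbf{j}} \rightarrow -\hat{\mathbf{j}}$, and $\hat{\mathbf{k}} \rightarrow -\hat{\mathbf{k}}$. Such a transformation corresponds to the transformation matrix

$$\boldsymbol{\lambda} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}
= -\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad (D.36)$$

Thus $|\boldsymbol{\lambda}| = -1$, that is, it corresponds to an improper rotation. A spatial inversion for two vectors
$\mathbf{A}(\mathbf{r})$ and $\mathbf{B}(\mathbf{r})$ corresponds to

$$\begin{aligned}
\mathbf{A}(\mathbf{r}) &= -\mathbf{A}(-\mathbf{r}) \\
\mathbf{B}(\mathbf{r}) &= -\mathbf{B}(-\mathbf{r})
\end{aligned} \qquad (D.37)$$

That is, normal polar vectors change sign under spatial reflection. However, the cross product
$\mathbf{C} = \mathbf{A} \times \mathbf{B}$ does not change sign under spatial inversion since the product of the two minus signs is
positive. That is,

$$\mathbf{C}(\mathbf{r}) = +\mathbf{C}(-\mathbf{r}) \qquad (D.38)$$

Figure D.3: Inversion of an object corresponds to reflection about the origin of all axes.

Thus the cross product behaves differently from a polar vector. This improper behavior is characteristic of an
axial vector, which also is called a pseudovector.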
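
The sign behavior is easy to confirm with numbers. The following sketch in plain Python inverts two arbitrarily chosen polar vectors (the values of `A` and `B` are illustrative) and compares the cross product before and after inversion.

```python
def cross(a, b):
    """Vector (cross) product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def invert(v):
    """Spatial inversion of a polar vector: every component changes sign."""
    return [-c for c in v]

A = [1.0, 2.0, 3.0]
B = [-2.0, 0.5, 4.0]

C = cross(A, B)                             # cross product in the original frame
C_inverted = cross(invert(A), invert(B))    # built from the inverted polar vectors

print(invert(A), invert(B))   # polar vectors change sign
print(C, C_inverted)          # the cross product does not
```

The two minus signs cancel component by component, so `C_inverted` equals `C` rather than `-C`.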

Examples of pseudovectors are angular momentum, spin, and magnetic field. These pseudovectors are
defined using the right-hand rule and thus have handedness. For a right-handed system

$$\mathbf{C}_R = \mathbf{A} \times \mathbf{B} \qquad (D.39)$$

Changing to a left-handed system leads to

$$\mathbf{C}_L = \mathbf{B} \times \mathbf{A} = -\mathbf{A} \times \mathbf{B} \qquad (D.40)$$

That is, handedness corresponds to a definite ordering of the cross product. Proper orthogonal transformations
are said to preserve chirality (Greek for handedness) of a coordinate system.

An  example  of  the  use  of  the  right-handed  system  is  the  usual  definition  of  cartesian  unit  vectors, 

i x j = k (D.41) 


An obvious question to ask is whether the handedness of a coordinate system is merely a mathematical
curiosity or whether it has some deep underlying significance. Consider the Lorentz force

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}) \qquad (D.42)$$

Since force and velocity are proper vectors, the magnetic field $\mathbf{B}$ must be a pseudovector. Note that
calculation of the $\mathbf{B}$ field occurs only in cross products such as

$$\nabla \times \mathbf{B} = \mu_0 \mathbf{j} \qquad (D.43)$$

where the current density $\mathbf{j}$ is a proper vector. Another example is the Biot-Savart Law which expresses $\mathbf{B}$
as

$$d\mathbf{B} = \frac{\mu_0 I}{4\pi} \frac{d\mathbf{l} \times \hat{\mathbf{r}}}{r^2} \qquad (D.44)$$

Thus even though $\mathbf{B}$ is a pseudovector, the force $\mathbf{F}$ remains a proper vector. Thus if a left-handed coordinate
definition $\mathbf{B}^L$, with the order of the cross product in D.44 reversed, is used together with
$\mathbf{F} = q(\mathbf{E} + \mathbf{B}^L \times \mathbf{v})$ in D.42, then the same final physical result would be obtained.

It was long thought that the laws of physics were symmetric with respect to spatial inversion (i.e. mirror
reflection), meaning that the choice between left-handed and right-handed representations (chirality) was
arbitrary. This is true for the gravitational, electromagnetic and strong forces, and is called the conservation
of parity. The fourth fundamental force in nature, the weak force, violates parity and favours handedness.
It turns out that right-handed ordinary matter is symmetrical with left-handed antimatter.

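In addition to the two flavours of vectors, one has scalars and pseudoscalars defined by:

$$\phi(\mathbf{r}) = +\phi(-\mathbf{r}) \qquad (D.45)$$

$$\phi(\mathbf{r}) = -\phi(-\mathbf{r}) \qquad (D.46)$$

An example of a pseudoscalar is the scalar triple product $\mathbf{A} \cdot (\mathbf{B} \times \mathbf{C})$.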


D.4  Time  reversal  transformation 

The basic laws of classical mechanics are invariant to the sense of the direction of time. Under time reversal
the vector $\mathbf{r}$ is unchanged while both momentum $\mathbf{p}$ and time $t$ change sign; thus the time
derivative $\frac{d\mathbf{p}}{dt}$ is invariant to time reversal. That is, the force is unchanged and Newton's Laws
$\mathbf{F} = \frac{d\mathbf{p}}{dt}$ are invariant under time reversal. Since the force can be expressed as the gradient of a scalar
potential for a conservative field, the potential also remains unchanged. That is

$$\frac{d\mathbf{p}}{dt} = -\nabla U(\mathbf{r}) = \mathbf{F} \qquad (D.47)$$

It is necessary to introduce tensor algebra, given in appendix E, prior to discussion of the transformation
properties of observables, which is the topic of appendix E.6.


Workshop  exercises 

1.  Suppose  the  aq-axis  of  a rectangular  coordinate  system  is  rotated  by  30°  away  from  the  aq-axis  around  the 
aq-axis. 


(a)  Find  the  corresponding  transformation  matrix.  Try  to  do  this  by  drawing  a diagram  instead  of  going  to 
the  book  or  the  notes  for  a formula. 
(b) Is this an orthogonal matrix? If so, show that it satisfies the main properties of an orthogonal matrix. If
not, explain why it fails to be orthogonal.

(c)  Does  this  matrix  represent  a proper  or  an  improper  rotation?  How  do  you  know? 


2. When you were first introduced to vectors, you most likely were told that a scalar is a quantity that is defined
by a magnitude, while a vector has both a magnitude and a direction. While this is certainly true, there is
another, more sophisticated way to define a scalar quantity and a vector quantity: through their transformation
properties. A scalar quantity transforms as $\phi' = \phi$ while a vector quantity transforms as
$A'_i = \sum_j \lambda_{ij} A_j$. To show that the scalar product does indeed transform as a scalar, note that:

$$\begin{aligned}
\mathbf{A}' \cdot \mathbf{B}' &= \sum_i A'_i B'_i = \sum_i \left( \sum_j \lambda_{ij} A_j \right) \left( \sum_k \lambda_{ik} B_k \right) \\
&= \sum_j \sum_k \left( \sum_i \lambda_{ij} \lambda_{ik} \right) A_j B_k = \sum_j \sum_k \delta_{jk} A_j B_k = \sum_j A_j B_j = \mathbf{A} \cdot \mathbf{B}
\end{aligned}$$


Now  you  will  show  that  the  vector  product  transforms  as  a vector.  Begin  by  writing  out  what  you  are  trying 
to  show  explicitly  and  show  it  to  the  teaching  assistant.  Once  the  teaching  assistant  has  confirmed  that  you 
have  the  correct  expression,  try  to  prove  it.  The  vector  product  is  a bit  more  difficult  to  work  with  than  the 
scalar  product,  so  your  teaching  assistant  is  prepared  to  give  you  a hint  if  you  get  stuck. 

3. Suppose you have two rectangular coordinate systems that share a common origin, but one system is rotated
by an angle $\theta$ with respect to the other. To describe this rotation, you have made use of the rotation matrix
$\boldsymbol{\lambda}(\theta)$. (I'm changing the notation slightly to put the emphasis on the angle of rotation.)

(a) Verify that the product of two rotation matrices $\boldsymbol{\lambda}(\theta_1)\boldsymbol{\lambda}(\theta_2)$ is in itself a rotation matrix.

(b) In abstract algebra, a group $G$ is defined as a set of elements $g$ together with a binary operation $*$ acting
on that set such that four properties are satisfied:

i. (Closure) For any two elements $g_i$ and $g_j$ in the group $G$, the product of the elements, $g_i * g_j$, is also
in the group $G$.

ii. (Associativity) For any three elements $g_i$, $g_j$, $g_k$ of the group $G$, $(g_i * g_j) * g_k = g_i * (g_j * g_k)$.

iii. (Existence of Identity) The group $G$ contains an identity element $e$ such that $g * e = e * g = g$ for
all $g \in G$.

iv. (Existence of Inverses) For each element $g \in G$, there exists an inverse element $g^{-1} \in G$ such that
$g * g^{-1} = g^{-1} * g = e$.

Show that if the product $*$ denotes the product of two matrices, then the set of rotation matrices together
with $*$ forms a group. This group is known as the special orthogonal group in two dimensions, also known
as $SO(2)$.

(c)  Is  this  group  commutative?  In  abstract  algebra,  a commutative  group  is  called  an  abelian  group. 


4.  When  you  look  in  a mirror  the  image  of  you  appears  left-to-right  reversed,  that  is,  the  image  of  your  left  ear 
appears to be the right ear of the image and vice versa. Explain why the image is left-right reversed rather
than  up-down  reversed  or  reversed  about  some  other  axis;  i.e.  explain  what  breaks  the  symmetry  that  leads  to 
these  properties  of  the  mirror  image. 


Problems 

[1] Find the transformation matrix that rotates the axis $x_3$ of a rectangular coordinate system 45° toward $x_1$ around
the $x_2$ axis.

[2] For simplicity, take $\boldsymbol{\lambda}$ to be a two-dimensional transformation matrix. Show by direct expansion that $|\boldsymbol{\lambda}|^2 = 1$.


Appendix  E 

Tensor  algebra 


E.1 Tensors


Mathematically  scalars  and  vectors  are  the  first  two  members  of  a hierarchy  of  entities,  called  tensors, 
that  behave  under  coordinate  transformations  as  described  in  appendix  D.  The  use  of  the  tensor  notation 
provides  a compact  and  elegant  way  to  handle  transformations  in  physics. 

A scalar is a rank 0 tensor with one component, that is invariant under change of the coordinate system.

$$\phi(x',y',z') = \phi(x,y,z) \qquad (E.1)$$


A vector is a rank 1 tensor which has three components, that transform under rotation according to the
matrix relation

$$\mathbf{x}' = \boldsymbol{\lambda} \cdot \mathbf{x} \qquad (E.2)$$

where $\boldsymbol{\lambda}$ is the rotation matrix. Equation E.2 can be written in the suffix form as

$$x'_i = \sum_{j=1}^{3} \lambda_{ij} x_j \qquad (E.3)$$


The above definitions of scalars and vectors can be subsumed into a class of entities called tensors of rank $n$
that have $3^n$ components. A scalar is a tensor of rank $r = 0$, with only $3^0 = 1$ component, whereas a vector
has rank $r = 1$, that is, the vector $\mathbf{x}$ has one suffix $i$ and $3^1 = 3$ components.

A second-order tensor $T_{ij}$ has rank $r = 2$ with two suffixes, that is, it has $3^2 = 9$ components that
transform under rotation as

$$T'_{ij} = \sum_{k=1}^{3} \sum_{l=1}^{3} \lambda_{ik} \lambda_{jl} T_{kl} \qquad (E.4)$$

For second-order tensors, the transformation formula given by equation E.4 can be written more compactly
using matrices. Thus the second-order tensor can be written as a 3 x 3 matrix

$$\mathbf{T} \equiv \begin{pmatrix}
T_{11} & T_{12} & T_{13} \\
T_{21} & T_{22} & T_{23} \\
T_{31} & T_{32} & T_{33}
\end{pmatrix} \qquad (E.5)$$


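The rotational transformation given in equation E.4 can be written in the form

$$T'_{ij} = \sum_{l=1}^{3} \left( \sum_{k=1}^{3} \lambda_{ik} T_{kl} \right) \lambda_{jl}
= \sum_{l=1}^{3} \left( \sum_{k=1}^{3} \lambda_{ik} T_{kl} \right) \lambda^T_{lj} \qquad (E.6)$$

where $\lambda^T_{lj}$ are the matrix elements of the transposed matrix $\boldsymbol{\lambda}^T$. The summations in E.6 can be expressed
in both the tensor and conventional matrix form as the matrix product

$$\mathbf{T}' = \boldsymbol{\lambda} \cdot \mathbf{T} \cdot \boldsymbol{\lambda}^T \qquad (E.7)$$

Equation E.7 defines the rotational properties of a spherical tensor.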
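
The equivalence of the index form E.4 and the matrix form E.7 can be checked numerically. The sketch below is plain Python; the 30° rotation about the third axis and the entries of `T` are hypothetical values chosen only for illustration.

```python
import math

def mat_mul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    """Transpose of a 3x3 matrix."""
    return [[m[j][i] for j in range(3)] for i in range(3)]

t = math.radians(30.0)
c, s = math.cos(t), math.sin(t)
lam = [[c, s, 0.0],       # rotation of the axes by 30 degrees about x3
       [-s, c, 0.0],
       [0.0, 0.0, 1.0]]

T = [[1.0, 2.0, 0.0],     # illustrative rank-2 tensor components
     [2.0, 3.0, 0.0],
     [0.0, 0.0, 4.0]]

# Matrix form, equation E.7: T' = lam . T . lam^T
T_matrix = mat_mul(mat_mul(lam, T), transpose(lam))

# Index form, equation E.4: T'_ij = sum_k sum_l lam_ik lam_jl T_kl
T_index = [[sum(lam[i][k] * lam[j][l] * T[k][l]
                for k in range(3) for l in range(3))
            for j in range(3)]
           for i in range(3)]

max_diff = max(abs(T_matrix[i][j] - T_index[i][j])
               for i in range(3) for j in range(3))
trace_before = sum(T[i][i] for i in range(3))
trace_after = sum(T_matrix[i][i] for i in range(3))
print(max_diff, trace_before, trace_after)
```

The two forms agree to rounding error, and the trace is unchanged by the rotation, as expected for an orthogonal transformation.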
E.2 Tensor products

E.2.1 Tensor outer product

Tensor products feature prominently when using tensors to represent transformations. A second-order tensor
$\mathbf{T}$ can be formed by using the tensor product, also called outer product, of two vectors $\mathbf{a}$ and $\mathbf{b}$ which,
written in suffix form, is

$$\mathbf{T} \equiv \mathbf{a} \otimes \mathbf{b} = \begin{pmatrix}
a_1 b_1 & a_1 b_2 & a_1 b_3 \\
a_2 b_1 & a_2 b_2 & a_2 b_3 \\
a_3 b_1 & a_3 b_2 & a_3 b_3
\end{pmatrix} \qquad (E.8)$$

In component form the matrix elements of this matrix are given by

$$T_{ij} = a_i b_j \qquad (E.9)$$

This second-order tensor product has a rank $r = 2$, that is, it equals the sum of the ranks of the two
vectors. Equation E.8 is called a dyad since it was derived by taking the dyadic product of two vectors. In
general, multiplication, or division, of two vectors leads to second-order tensors. Note that this second-order
tensor product completes the triad of tensors possible taking the product of two vectors. That is, the scalar
product $\mathbf{a} \cdot \mathbf{b}$ has rank $r = 0$, the vector product $\mathbf{a} \times \mathbf{b}$ has rank $r = 1$, and the tensor product
$\mathbf{a} \otimes \mathbf{b}$ has rank$^1$ $r = 2$.
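
The triad of products of two vectors can be illustrated with a few lines of plain Python. The vectors `a` and `b` and the helper names are illustrative assumptions; the sketch also shows that the rank-0 and rank-1 products are contained in the dyad, as its trace and its antisymmetric part respectively.

```python
def dot(a, b):
    """Scalar product (rank 0)."""
    return sum(a[i] * b[i] for i in range(3))

def cross(a, b):
    """Vector product (rank 1)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def outer(a, b):
    """Tensor (outer) product, T_ij = a_i b_j (rank 2)."""
    return [[ai * bj for bj in b] for ai in a]

a = [1, 2, 3]
b = [4, 5, 6]

T = outer(a, b)
trace = sum(T[i][i] for i in range(3))

# The antisymmetric part of the dyad reproduces the cross product components.
antisym = [T[1][2] - T[2][1], T[2][0] - T[0][2], T[0][1] - T[1][0]]

print(T)
print(trace, dot(a, b))
print(antisym, cross(a, b))
```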

Higher-order tensors can be created by taking more complicated tensor products. For example, a rank-3
tensor can be created by taking the tensor outer product of the rank-2 tensor $T_{ij}$ and a vector $c_k$ which, for
a dyadic tensor, can be written as the tensor product of three vectors. That is,

$$T_{ijk} = T_{ij} c_k = a_i b_j c_k \qquad (E.10)$$

In summary, the rank of the tensor product equals the sum of the ranks of the tensors included in the tensor
product.


E.2.2 Tensor inner product

The lowest rank tensor product, which is called the inner product, is obtained by taking the tensor product
of two tensors for the special case where one index is repeated, and taking the sum over this repeated index.
Summing over this repeated index, which is called contraction, removes the two indices for which the index
is repeated, resulting in a tensor that has rank $r$ equal to the sum of the ranks minus 2 for one contraction.
That is, the product tensor has rank $r = r_1 + r_2 - 2$.

The simplest example is the inner product of two vectors which has rank $r = 1 + 1 - 2 = 0$, that is, it is
the scalar product that equals the trace of the outer-product matrix $\mathbf{a} \otimes \mathbf{b}$, and this inner product is
commutative.

An especially important case is the inner product of a rank-2 dyad $\mathbf{a} \otimes \mathbf{b}$, given by equation E.8, with a
vector $\mathbf{c}$, that is, the inner product $\mathbf{T} = \mathbf{a} \otimes \mathbf{b} \cdot \mathbf{c}$. Written in component form, the inner product is

$$\sum_{i}^{3} a_i b_i c_j = \left( \sum_{i}^{3} a_i b_i \right) c_j = (\mathbf{a} \cdot \mathbf{b}) c_j \qquad (E.11)$$

The scalar product $\mathbf{a} \cdot \mathbf{b}$ is a scalar number, and thus the inner-product tensor is the vector $\mathbf{c}$ renormalized
by the magnitude of the scalar product $\mathbf{a} \cdot \mathbf{b}$. That is, it has a rank $r = 2 + 1 - 2 = 1$. Thus the inner product
of this rank-2 tensor with a vector gives a vector. The inner product of a rank-2 tensor with a rank-1 tensor
is used in this book for handling the rotation matrix, the inertia tensor for rigid-body rotation, and for the
stress and the strain tensors used to describe elasticity in solids.

$^1$The common convention is to denote the scalar product as $\mathbf{a} \cdot \mathbf{b}$, the vector product as $\mathbf{a} \times \mathbf{b}$, and tensor product as $\mathbf{a} \otimes \mathbf{b}$.

E.1 Example: Displacement gradient tensor

The displacement gradient tensor provides an example of the use of the matrix representation to manipulate
tensors. Let $\boldsymbol{\phi}(x_1, x_2, x_3)$ be a vector field expressed in a cartesian basis. The definition of the gradient
$\mathbf{G} = \nabla\boldsymbol{\phi}$ gives that

$$d\boldsymbol{\phi} = \mathbf{G} \cdot d\mathbf{x}$$
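Calculating the components of $d\boldsymbol{\phi}$ in terms of $\mathbf{x}$ gives

$$\begin{aligned}
d\phi_1 &= \frac{\partial\phi_1}{\partial x_1} dx_1 + \frac{\partial\phi_1}{\partial x_2} dx_2 + \frac{\partial\phi_1}{\partial x_3} dx_3 \\
d\phi_2 &= \frac{\partial\phi_2}{\partial x_1} dx_1 + \frac{\partial\phi_2}{\partial x_2} dx_2 + \frac{\partial\phi_2}{\partial x_3} dx_3 \\
d\phi_3 &= \frac{\partial\phi_3}{\partial x_1} dx_1 + \frac{\partial\phi_3}{\partial x_2} dx_2 + \frac{\partial\phi_3}{\partial x_3} dx_3
\end{aligned}$$

Using index notation this can be written as

$$d\phi_i = \sum_j \frac{\partial\phi_i}{\partial x_j} dx_j$$

The second-rank gradient tensor $\mathbf{G}$ can be represented in the matrix form as

$$\mathbf{G} = \begin{pmatrix}
\frac{\partial\phi_1}{\partial x_1} & \frac{\partial\phi_1}{\partial x_2} & \frac{\partial\phi_1}{\partial x_3} \\
\frac{\partial\phi_2}{\partial x_1} & \frac{\partial\phi_2}{\partial x_2} & \frac{\partial\phi_2}{\partial x_3} \\
\frac{\partial\phi_3}{\partial x_1} & \frac{\partial\phi_3}{\partial x_2} & \frac{\partial\phi_3}{\partial x_3}
\end{pmatrix}$$

Then the vector $d\boldsymbol{\phi}$ can be expressed compactly as the inner product of $\mathbf{G}$ and $d\mathbf{x}$, that is

$$d\boldsymbol{\phi} = \mathbf{G} \cdot d\mathbf{x}$$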


E.3  Tensor  properties 

In  principle  one  must  distinguish  between  a 3 x 3 square  matrix,  and  the  tensor  component  representations  of 
a rank-2  tensor.  However,  as  illustrated  by  the  previous  discussion,  for  orthogonal  transformations,  the  tensor 
components  of  the  second  rank  tensor  transform  identically  with  the  matrix  components.  Thus  functionally, 
the  matrix  formulation  and  tensor  representations  are  identical.  As  a consequence,  all  the  terminology  and 
operations  used  in  matrix  mechanics  are  equally  applicable  to  the  tensor  representation. 

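The tensor representation of the rotation matrix provides the simplest example of the equivalence of
the matrix and tensor representations of transformations. Appendix D.2 showed that the unitary rotation
matrix $\boldsymbol{\lambda}$, acting on a vector $\mathbf{x}$, transforms it to the vector $\mathbf{x}'$ that is rotated with respect to $\mathbf{x}$. That is, the
transformation is

$$\mathbf{x}' = \boldsymbol{\lambda} \cdot \mathbf{x} \qquad (D.5)$$

where

$$\mathbf{x}' = \begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} \qquad
\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \qquad
\boldsymbol{\lambda} = \begin{pmatrix}
\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_3 \\
\hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_3 \\
\hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_3
\end{pmatrix} \qquad (D.6)$$

Appendix D.2 showed that the rotation matrix $\boldsymbol{\lambda}$ requires 9 components to fully specify the transformation
from the initial 3-component vector $\mathbf{x}$ to the rotated vector $\mathbf{x}'$. The rotation tensor is a dyad as well as being
unitary and dimensionless. Note that equation D.5 is an example of the inner product of a rank-2 rotation
tensor acting on a vector leading to another vector that is rotated with respect to the first vector.

In general, rank-2 tensors have dimensions and are not unitary. For example, the angular velocity vector
$\boldsymbol{\omega}$ and the angular momentum vector $\mathbf{L}$ are related by the inner product of the inertia tensor $\{\mathbf{I}\}$ and $\boldsymbol{\omega}$.
That is

$$\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} \qquad (11.6)$$

The inertia tensor has dimensions of mass x length$^2$ and relates two very different vector observables. The
stress tensor and the strain tensor, discussed in chapter 15, provide another example of second-order tensors
that are used to transform one vector observable to another vector observable analogous to the case of the
rotation matrix or the inertia tensor.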

Note  that  pseudo-tensors  can  be  used  to  make  a rotational  transformation  plus  a change  in  the  sign. 
That  is,  they  lead  to  a parity  inversion. 

The  tensor  notation  is  used  extensively  in  physics  since  it  provides  a powerful,  elegant,  and  compact 
representation  for  describing  transformations. 
E.4 Contravariant and covariant tensors


In general the configuration space used to specify a dynamical system is not a Euclidean space in that
there may not be a system of coordinates for which the distance between any two neighboring points can
be represented by the sum of the squares of the coordinate differentials. For example, a set of cartesian
coordinates does not exist for the two-dimensional motion of a single particle constrained to the curved surface
of a fixed sphere. Such curved spaces need to be represented in terms of Riemannian geometry rather
than Euclidean geometry. Curved configuration spaces occur in some branches of physics such as Einstein's
General Theory of Relativity.

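Tensors have transformation properties that can be either contravariant or covariant. Consider a set of
generalized coordinates $q'$ that are a function of the coordinates $q$. Then infinitesimal changes $dq^m$ will lead
to infinitesimal changes $dq'^n$ where

$$dq'^n = \sum_m \frac{\partial q'^n}{\partial q^m} dq^m \qquad (E.12)$$

Contravariant components of a tensor transform according to the relation

$$A'^n = \sum_m \frac{\partial q'^n}{\partial q^m} A^m \qquad (E.13)$$

Equation E.13 relates the contravariant components in the unprimed and primed frames.

Derivatives of a scalar function $\phi$ transform as

$$A'_n \equiv \frac{\partial\phi}{\partial q'^n} = \sum_m \frac{\partial\phi}{\partial q^m} \frac{\partial q^m}{\partial q'^n} = \sum_m \frac{\partial q^m}{\partial q'^n} A_m \qquad (E.14)$$

That is, covariant components of the tensor transform according to the relation

$$A'_n = \sum_m \frac{\partial q^m}{\partial q'^n} A_m \qquad (E.15)$$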

It is important to differentiate between contravariant and covariant vectors. The Einstein superscript/subscript
convention for distinguishing between these two flavours of tensors is given in table E.1.


Table E.1. Einstein notation for tensors.

  $x^{\mu}$    denotes a contravariant vector
  $x_{\nu}$    denotes a covariant vector

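In linear algebra one can map from one coordinate system to another as illustrated in appendix D. That
is, the tensor $\mathbf{x}$ can be expressed as components with respect to either the unprimed or primed coordinate
frames

$$\mathbf{x} = \hat{\mathbf{e}}'_1 x'_1 + \hat{\mathbf{e}}'_2 x'_2 + \hat{\mathbf{e}}'_3 x'_3 = \hat{\mathbf{e}}_1 x_1 + \hat{\mathbf{e}}_2 x_2 + \hat{\mathbf{e}}_3 x_3 \qquad (E.16)$$

For an n-dimensional manifold the unit basis column vectors $\hat{\mathbf{e}}$ transform according to the transformation
matrix $\boldsymbol{\lambda}$

$$\hat{\mathbf{e}}' = \boldsymbol{\lambda} \cdot \hat{\mathbf{e}} \qquad (E.17)$$

Since the tensor $\mathbf{x}$ is independent of the coordinate basis, the components of $\mathbf{x}$ must have the opposite
transform

$$\mathbf{x}' = (\boldsymbol{\lambda}^{-1})^T \cdot \mathbf{x} \qquad (E.18)$$

This normal vector $\mathbf{x}$ is called a "contravariant vector" because it transforms contrary to the basis column
vector transformation.

The inverse of equation E.18 gives that the column vector element

$$x_\nu = \sum_\mu \lambda_{\mu\nu} x'_\mu \qquad (E.19)$$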
Consider the case of a gradient with respect to the coordinate $\mathbf{x}$ in both the unprimed and primed bases.
Using the chain rule for the partial derivative then the component of the gradient in the primed frame can
be expanded as

$$(\nabla' f)_\mu = \frac{\partial f}{\partial x'_\mu} = \sum_\nu \frac{\partial f}{\partial x_\nu} \frac{\partial x_\nu}{\partial x'_\mu} = \sum_\nu \frac{\partial f}{\partial x_\nu} \lambda_{\mu\nu} = \sum_\nu \lambda_{\mu\nu} (\nabla f)_\nu \qquad (E.20)$$

That is, the gradient transforms as

$$\nabla' f = \boldsymbol{\lambda} \cdot \nabla f \qquad (E.21)$$

That is, a gradient transforms as a covariant vector, like the unit vectors, whereas a vector $\mathbf{x}$ is contravariant
under transformation.

Normally the basis is orthonormal, $(\boldsymbol{\lambda}^{-1})^T = \boldsymbol{\lambda}$, and thus there is no difference between contravariant and
covariant vectors. However, for curved coordinate systems, such as non-Euclidean geometry in the General
Theory of Relativity, the covariant and contravariant vectors behave differently.
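
The distinction can be seen with a basis change that is not orthonormal. The sketch below, in plain Python and restricted to two dimensions for brevity, applies equation E.18 to a vector and equation E.21 to a gradient and checks that their contraction is unchanged; the matrix `lam`, the vector `x` and the gradient `grad_f` are hypothetical values chosen for illustration.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(2)) for i in range(2)]

def dot(a, b):
    return sum(a[i] * b[i] for i in range(2))

lam = [[2.0, 1.0],        # illustrative non-orthogonal basis transformation
       [0.0, 1.0]]
x = [3.0, 4.0]            # contravariant components
grad_f = [1.0, 2.0]       # covariant components (a gradient)

x_prime = mat_vec(transpose(inv2(lam)), x)   # equation E.18
grad_prime = mat_vec(lam, grad_f)            # equation E.21

print(x_prime, grad_prime)
print(dot(x, grad_f), dot(x_prime, grad_prime))   # the contraction is invariant

# For an orthonormal change of basis (a rotation) the two rules coincide.
t = 0.3
rot = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]
rot_inv_T = transpose(inv2(rot))
```

For the skewed basis the vector and the gradient transform with different matrices, yet their contraction is unchanged; for the rotation `rot`, the matrix `rot_inv_T` reproduces `rot` itself, so the distinction disappears.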

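The Einstein convention is extended to apply to matrices by writing the elements of the matrix $\boldsymbol{\lambda}$ as
$\lambda^{\mu}{}_{\nu}$, while the elements of the transposed matrix $\boldsymbol{\lambda}^T$ are written as $\lambda_{\nu}{}^{\mu}$. The matrix product for $\boldsymbol{\lambda}$ with a
contravariant vector $\mathbf{X}$ is written as

$$X'^{\mu} = \sum_\nu \lambda^{\mu}{}_{\nu} X^{\nu} \qquad (E.22)$$

where the summation over $\nu$ effectively cancels the identical superscript and subscript $\nu$.

Similarly a covariant vector, such as a gradient, is written as,

$$(\nabla' f)_\mu = \sum_\nu \left( (\boldsymbol{\lambda}^{-1})^T \right)_{\mu}{}^{\nu} (\nabla f)_\nu = \sum_\nu (\boldsymbol{\lambda}^{-1})^{\nu}{}_{\mu} (\nabla f)_\nu \qquad (E.23)$$

Again the summation cancels the $\nu$ superscript and subscript. The Kronecker delta symbol is written as

$$\sum_\nu \delta^{\mu}{}_{\nu} X^{\nu} = X^{\mu} \qquad (E.24)$$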


E.5  Generalized  inner  product 

The generalized definition of an inner product is

$$S = \sum_{\mu\nu} g_{\mu\nu} X^{\mu} Y^{\nu} \tag{E.25}$$

where $g_{\mu\nu}$ is a unitary matrix called a covariant metric. The covariant metric transforms a contravariant to a covariant tensor. For example the matrix element of a covariant tensor $X_{\nu}$ can be written as

$$X_{\nu} = \sum_{\mu} g_{\mu\nu} X^{\mu} \tag{E.26}$$

Association of the covariant metric with either of the vectors in the inner product gives

$$S = \sum_{\mu\nu} g_{\mu\nu} X^{\mu} Y^{\nu} = \sum_{\nu} X_{\nu} Y^{\nu} = \sum_{\mu} X^{\mu} Y_{\mu} \tag{E.27}$$

Similarly it can be defined in terms of an orthogonal contravariant metric $g^{\mu\nu}$ where

$$S = \sum_{\mu\nu} g^{\mu\nu} X_{\mu} Y_{\nu} \tag{E.28}$$

Then

$$X^{\nu} = \sum_{\mu} g^{\mu\nu} X_{\mu} \tag{E.29}$$

Association of the contravariant metric with one of the vectors in the inner product gives the inner product

$$S = \sum_{\mu\nu} g^{\mu\nu} X_{\mu} Y_{\nu} = \sum_{\mu} X_{\mu} Y^{\mu} = \sum_{\nu} X^{\nu} Y_{\nu} \tag{E.30}$$

For most situations in this book the metric $g$ is diagonal and unitary.
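As a hedged illustration of equations E.25 to E.30, the sketch below lowers and raises indices with a diagonal metric and evaluates the inner product in each of the equivalent ways. The metric signature, the component values, and the function names are assumptions made for this example only.

```python
# Diagonal metric with entries +1 and -1, chosen only for illustration.
g_lower = [[1.0, 0.0, 0.0, 0.0],
           [0.0, -1.0, 0.0, 0.0],
           [0.0, 0.0, -1.0, 0.0],
           [0.0, 0.0, 0.0, -1.0]]
# For this particular metric the contravariant metric (the inverse) is identical.
g_upper = [row[:] for row in g_lower]

def lower_index(g, v_up):
    """X_nu = sum_mu g_{mu nu} X^mu, as in equation E.26."""
    n = len(v_up)
    return [sum(g[mu][nu] * v_up[mu] for mu in range(n)) for nu in range(n)]

def raise_index(g_inv, v_down):
    """X^nu = sum_mu g^{mu nu} X_mu, as in equation E.29."""
    n = len(v_down)
    return [sum(g_inv[mu][nu] * v_down[mu] for mu in range(n)) for nu in range(n)]

def contract(g, a, b):
    """Double sum over both indices, as in equations E.25 and E.28."""
    n = len(a)
    return sum(g[mu][nu] * a[mu] * b[nu] for mu in range(n) for nu in range(n))

X_up = [2.0, 1.0, 0.0, -1.0]
Y_up = [1.0, 3.0, 2.0, 0.5]
X_down = lower_index(g_lower, X_up)
Y_down = lower_index(g_lower, Y_up)

S_covariant_metric = contract(g_lower, X_up, Y_up)              # E.25
S_mixed_a = sum(xd * yu for xd, yu in zip(X_down, Y_up))         # E.27
S_mixed_b = sum(xu * yd for xu, yd in zip(X_up, Y_down))         # E.27
S_contravariant_metric = contract(g_upper, X_down, Y_down)      # E.28
X_up_again = raise_index(g_upper, X_down)                        # E.29
```

All four values of the inner product agree, and raising the lowered index returns the original contravariant components.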
E.6 Transformation properties of observables

In physics, observables can be represented by spherical tensors which specify the angular momentum and parity characteristics of the observable, and the tensor rank is independent of the time dependence. The transformation properties of these tensors, coupled with their time-reversal invariance, specify the fundamental characteristics of the observables.

Table E.2 summarizes the transformation properties under rotation, spatial inversion and time reversal for observables encountered in classical mechanics and electrodynamics. Note that observables can be scalars, vectors, pseudovectors, or second-order tensors under rotation, and even or odd under either space inversion or time inversion. For example, in classical mechanics the inertia tensor $\mathbb{I}$ relates the angular velocity vector $\boldsymbol{\omega}$ to the angular momentum vector $\mathbf{L}$ by taking the inner product $\mathbf{L} = \mathbb{I} \cdot \boldsymbol{\omega}$. In general $\mathbb{I}$ is not diagonal and thus the angular momentum is not parallel to the angular velocity $\boldsymbol{\omega}$. A similar example in electrodynamics is the dielectric tensor $\mathbf{K}$ which relates the displacement field $\mathbf{D}$ to the electric field $\mathbf{E}$ by $\mathbf{D} = \mathbf{K} \cdot \mathbf{E}$. For anisotropic crystal media $\mathbf{K}$ is not diagonal, so the field vectors $\mathbf{E}$ and $\mathbf{D}$ are not parallel.

As discussed in chapter 7, Noether's Theorem states that symmetries of the transformation properties lead to important conservation laws. The behavior of classical systems under rotation relates to the conservation of angular momentum, the behavior under spatial inversion relates to parity conservation, and time-reversal invariance relates to conservation of energy. That is, conservative forces conserve energy and are time-reversal invariant.

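Table E.2: Transformation properties of scalar, vector, pseudovector, and tensor observables under rotation, spatial inversion, and time reversal^2

Physical observable      | Symbol                                      | Rotation (tensor rank) | Space inversion | Time reversal | Name
-------------------------+---------------------------------------------+------------------------+-----------------+---------------+-------------
1) Classical Mechanics   |                                             |                        |                 |               |
Mass density             | $\rho$                                      | 0                      | Even            | Even          | Scalar
Kinetic energy           | $p^2/2m$                                    | 0                      | Even            | Even          | Scalar
Potential energy         | $U(r)$                                      | 0                      | Even            | Even          | Scalar
Lagrangian               | $L$                                         | 0                      | Even            | Even          | Scalar
Hamiltonian              | $H$                                         | 0                      | Even            | Even          | Scalar
Gravitational potential  | $\phi$                                      | 0                      | Even            | Even          | Scalar
Coordinate               | $\mathbf{r}$                                | 1                      | Odd             | Even          | Vector
Velocity                 | $\mathbf{v}$                                | 1                      | Odd             | Odd           | Vector
Momentum                 | $\mathbf{p}$                                | 1                      | Odd             | Odd           | Vector
Angular momentum         | $\mathbf{L} = \mathbf{r} \times \mathbf{p}$ | 1                      | Even            | Odd           | Pseudovector
Force                    | $\mathbf{F}$                                | 1                      | Odd             | Even          | Vector
Torque                   | $\mathbf{N} = \mathbf{r} \times \mathbf{F}$ | 1                      | Even            | Even          | Pseudovector
Gravitational field      | $\mathbf{g}$                                | 1                      | Odd             | Even          | Vector
Inertia tensor           | $\mathbb{I}$                                | 2                      | Even            | Even          | Tensor
Elasticity stress tensor | $T_{ik}$                                    | 2                      | Even            | Even          | Tensor
2) Electromagnetism      |                                             |                        |                 |               |
Charge density           | $\rho$                                      | 0                      | Even            | Even          | Scalar
Current density          | $\mathbf{j}$                                | 1                      | Odd             | Odd           | Vector
Electric field           | $\mathbf{E}$                                | 1                      | Odd             | Even          | Vector
Polarization             | $\mathbf{P}$                                | 1                      | Odd             | Even          | Vector
Displacement             | $\mathbf{D}$                                | 1                      | Odd             | Even          | Vector
Magnetic B field         | $\mathbf{B}$                                | 1                      | Even            | Odd           | Pseudovector
Magnetization            | $\mathbf{M}$                                | 1                      | Even            | Odd           | Pseudovector
Magnetic H field         | $\mathbf{H}$                                | 1                      | Even            | Odd           | Pseudovector
Poynting vector          | $\mathbf{S} = \mathbf{E} \times \mathbf{H}$ | 1                      | Odd             | Odd           | Vector
Dielectric tensor        | $\mathbf{K}$                                | 2                      | Even            | Even          | Tensor
Maxwell stress tensor    | $T_{ik}$                                    | 2                      | Even            | Even          | Tensor

^2 Based on table 6.1 in "Classical Electrodynamics", 2nd edition, by J.D. Jackson.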


Appendix  F 

Aspects  of  multivariate  calculus 


Multivariate  calculus  provides  the  framework  for  handling  systems  having  many  variables  associated  with 
each  of  several  bodies.  It  is  assumed  that  the  reader  has  studied  linear  differential  equations  plus  multivariate 
calculus  and  thus  has  been  exposed  to  the  calculus  used  in  classical  mechanics.  Chapter  5 of  this  book 
introduced  variational  calculus  which  covers  several  important  aspects  of  multivariate  calculus  such  as  Euler’s 
variational  calculus  and  Lagrange  multipliers.  This  appendix  provides  a brief  review  of  a selection  of  other 
aspects  of  multivariate  calculus  that  feature  prominently  in  classical  mechanics. 


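F.1 Partial differentiation


The extension of the derivative to multivariate calculus involves use of partial derivatives. The partial derivative with respect to the variable $x_i$ of a multivariate function $f(x_1, x_2, \ldots, x_N)$ involves taking the normal one-variable derivative with respect to $x_i$ assuming that the other $N-1$ variables are held constant. That is,

$$\frac{\partial f(x_1, x_2, \ldots, x_N)}{\partial x_i} = \lim_{h_i \to 0} \frac{f(x_1, x_2, \ldots, x_{i-1}, (x_i + h_i), \ldots, x_N) - f(x_1, x_2, \ldots, x_N)}{h_i} \tag{F.1}$$

It will be assumed that the function $f(\mathbf{x})$ is continuously differentiable to $n$th order; then all partial derivatives of that order or less are independent of the order in which they are performed. That is,

$$\frac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j} = \frac{\partial^2 f(\mathbf{x})}{\partial x_j \partial x_i} \tag{F.2}$$

The chain rule for partial differentiation gives that

$$\frac{\partial f(y_1, y_2, \ldots, y_N)}{\partial y_i} = \sum_{k=1}^{N} \frac{\partial f(\mathbf{x})}{\partial x_k} \frac{\partial x_k(\mathbf{y})}{\partial y_i} \tag{F.3}$$

The total differential of a multivariate function $f(\mathbf{x})$ is

$$df = \sum_{k=1}^{N} \frac{\partial f}{\partial x_k} dx_k \tag{F.4}$$

This can be extended to higher-order derivatives using the operator formalism

$$d^n f = \left( dx_1 \frac{\partial}{\partial x_1} + \cdots + dx_N \frac{\partial}{\partial x_N} \right)^n f = \sum_{j_1 \ldots j_n} dx_{j_1} \cdots dx_{j_n} \frac{\partial^n f(\mathbf{x})}{\partial x_{j_1} \cdots \partial x_{j_n}} \tag{F.5}$$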
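The limit in equation F.1 and the symmetry of mixed partial derivatives in equation F.2 can be illustrated with finite differences. This is a minimal sketch: the test function `f`, the evaluation point, and the step sizes are arbitrary assumptions, and a central difference is used in place of the one-sided limit of F.1 because it converges faster.

```python
import math

def f(x, y):
    """Arbitrary smooth test function, chosen only for illustration."""
    return x**2 * math.sin(y) + math.exp(x * y)

def partial(func, point, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative at point."""
    plus, minus = list(point), list(point)
    plus[i] += h
    minus[i] -= h
    return (func(*plus) - func(*minus)) / (2 * h)

def mixed(func, point, i, j, h=1e-4):
    """Estimate d/dx_i of (d/dx_j func) by nesting two central differences."""
    return partial(lambda *p: partial(func, p, j, h), point, i, h)

point = (0.4, 1.1)
x0, y0 = point

df_dx = partial(f, point, 0)
df_dx_exact = 2 * x0 * math.sin(y0) + y0 * math.exp(x0 * y0)

f_xy = mixed(f, point, 0, 1)   # differentiate with respect to y first, then x
f_yx = mixed(f, point, 1, 0)   # differentiate with respect to x first, then y
f_xy_exact = 2 * x0 * math.cos(y0) + (1 + x0 * y0) * math.exp(x0 * y0)
```

The two orderings of the mixed derivative agree with each other and with the analytic value to within the finite-difference error.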


F.2  Linear  operators 

The linear operator notation provides a powerful, elegant, and compact way to express, and apply, the equations of multivariate calculus; it is used extensively in mathematics and physics. The linear operators typically comprise partial derivatives that act on scalar, vector, or tensor fields. Table F.1 lists a few elementary examples of the use of linear operators in this textbook. The first four linear operators involve the widely used del operator $\nabla$ to generate the gradient, divergence and curl as described in appendices G and H. The fifth and sixth linear operators act on the Lagrangian in Lagrangian mechanics applications. The final two linear operators act on the wavefunction for wave mechanics.


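Name               | Partial derivative                                                                                                                                   | Field                   | Action
-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+--------------------------------------------------------------
Gradient           | $\nabla \equiv \hat{\mathbf{i}}\frac{\partial}{\partial x} + \hat{\mathbf{j}}\frac{\partial}{\partial y} + \hat{\mathbf{k}}\frac{\partial}{\partial z}$ | Scalar potential $V$    | $\mathbf{E} = -\nabla V$
Divergence         | $\nabla\cdot \equiv \hat{\mathbf{i}}\frac{\partial}{\partial x}\cdot + \hat{\mathbf{j}}\frac{\partial}{\partial y}\cdot + \hat{\mathbf{k}}\frac{\partial}{\partial z}\cdot$ | Vector field $\mathbf{E}$ | $\nabla \cdot \mathbf{E}$
Curl               | $\nabla\times \equiv \hat{\mathbf{i}}\frac{\partial}{\partial x}\times + \hat{\mathbf{j}}\frac{\partial}{\partial y}\times + \hat{\mathbf{k}}\frac{\partial}{\partial z}\times$ | Vector field $\mathbf{E}$ | $\nabla \times \mathbf{E}$
Laplacian          | $\nabla^2 = \nabla\cdot\nabla \equiv \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$              | Scalar potential $V$    | $\nabla^2 V$
Euler-Lagrange     | $\Lambda_j \equiv \frac{d}{dt}\frac{\partial}{\partial \dot{q}_j} - \frac{\partial}{\partial q_j}$                                                    | Scalar Lagrangian $L$   | $\Lambda_j L = 0$
Canonical momentum | $p_j \equiv \frac{\partial}{\partial \dot{q}_j}$                                                                                                     | Scalar Lagrangian $L$   | $p_j = \frac{\partial L}{\partial \dot{q}_j}$
Canonical momentum | $p_j \equiv \frac{\hbar}{i}\frac{\partial}{\partial q_j}$                                                                                            | Wavefunction $\Psi$     | $p_j \Psi = \frac{\hbar}{i}\frac{\partial \Psi}{\partial q_j}$
Hamiltonian        | $H \equiv i\hbar\frac{\partial}{\partial t}$                                                                                                         | Wavefunction $\Psi$     | $H\Psi = i\hbar\frac{\partial \Psi}{\partial t} = E\Psi$

Table F.1: Examples of linear operators used in this textbook.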

There are three ways of expressing operations such as addition, multiplication, transposition or inversion of operations that are completely equivalent because they all are based on the same principles of linear algebra. For example, a transformation $\mathbf{O}$ acting on a vector $\mathbf{A}$ can produce the vector $\mathbf{B}$. The simplest way to express this transformation is in terms of components

$$B_i = \sum_{j=1}^{3} O_{ij} A_j \tag{F.6}$$

Another way is to use matrix mechanics where the 3 x 3 matrix $(\mathbf{O})$ transforms the column vector $(\mathbf{A})$ to the column vector $(\mathbf{B})$, that is,

$$(\mathbf{B}) = (\mathbf{O})(\mathbf{A}) \tag{F.7}$$

The third approach is to assume an operator $\mathbf{O}$ acts on the vector $\mathbf{A}$

$$\mathbf{B} = \mathbf{O}\mathbf{A} \tag{F.8}$$

In  classical  mechanics,  and  quantum  mechanics,  these  three  equivalent  approaches  are  used  and  exploited 
extensively  and  interchangeably.  In  particular  the  rules  of  matrix  manipulation,  that  are  given  in  appendix 
A,  are  synonymous,  and  equivalent  to,  those  that  apply  for  operator  manipulation.  If  the  operator  is  complex 
then  the  operator  properties  are  summarized  as  follows. 

The generalization of the transpose for complex operators is the Hermitian conjugate $O^{\dagger}$

$$O^{\dagger}_{ij} = O^{*}_{ji} \tag{F.9}$$

Note also that

$$O^{\dagger} = (O^{*})^{T} = (O^{T})^{*} \tag{F.10}$$

The generalization of a symmetric matrix is Hermitian, that is, $O$ is equal to its Hermitian conjugate

$$O^{\dagger}_{ij} = O^{*}_{ji} = O_{ij} \tag{F.11}$$

For a real matrix the complex conjugation has no effect so the matrix is real and symmetric.

The generalization of orthogonal is unitary, for which the operator is unitary if it is non-singular and

$$O^{-1} = O^{\dagger} \tag{F.12}$$

which implies

$$O O^{\dagger} = \mathbb{1} = O^{\dagger} O \tag{F.13}$$
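The operator properties in equations F.9 to F.13 can be checked on small complex matrices. The sketch below is illustrative only; the matrices `H_op` and `U_op` and the helper names are hypothetical examples of a Hermitian and a unitary operator.

```python
import cmath

def dagger(m):
    """Hermitian conjugate: transpose and complex-conjugate (equation F.9)."""
    return [[m[j][i].conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))

identity = [[1 + 0j, 0j], [0j, 1 + 0j]]

# Hypothetical Hermitian operator: equal to its own Hermitian conjugate (F.11).
H_op = [[2 + 0j, 1 - 1j],
        [1 + 1j, 3 + 0j]]

# Hypothetical unitary operator: its inverse is its Hermitian conjugate (F.12).
phase = 0.7
s = complex(1 / 2**0.5)
U_op = [[cmath.exp(1j * phase) * s, s],
        [-s, cmath.exp(-1j * phase) * s]]

is_hermitian = close(dagger(H_op), H_op)
is_unitary = (close(matmul(U_op, dagger(U_op)), identity)
              and close(matmul(dagger(U_op), U_op), identity))   # F.13
```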
F.3 Transformation Jacobian


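The Jacobian determinant, which is usually called the Jacobian, is used extensively in mechanics for both rotational and translational coordinate transformations. The Jacobian determinant is defined as being the ratio of the n-dimensional volume element $dx_1 dx_2 \cdots dx_n$ in one coordinate system, to the volume element $dy_1 dy_2 \cdots dy_n$ in the second coordinate system. That is

$$J(y_1 y_2 \ldots y_n) \equiv \frac{dx_1 dx_2 \cdots dx_n}{dy_1 dy_2 \cdots dy_n} = \begin{vmatrix} \frac{\partial x_1}{\partial y_1} & \frac{\partial x_1}{\partial y_2} & \cdots & \frac{\partial x_1}{\partial y_n} \\ \frac{\partial x_2}{\partial y_1} & \frac{\partial x_2}{\partial y_2} & \cdots & \frac{\partial x_2}{\partial y_n} \\ \vdots & \vdots & & \vdots \\ \frac{\partial x_n}{\partial y_1} & \frac{\partial x_n}{\partial y_2} & \cdots & \frac{\partial x_n}{\partial y_n} \end{vmatrix} \tag{F.14}$$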


F.3.1 Transformation of integrals:

Consider a coordinate transformation for the integral of the function $f(x_1, x_2, \ldots, x_n)$ to the integral of a function $g(y_1, y_2, \ldots, y_n)$ where $y_i = h_i(x_1, x_2, \ldots, x_n)$. The coordinate transformation of the integral equation can be expressed in terms of the Jacobian $J(y_1 y_2 \ldots y_n)$

$$\int f(x_1, x_2, \ldots, x_n)\, dx_1 dx_2 \cdots dx_n = \int f(x_1, x_2, \ldots, x_n) \frac{dx_1 dx_2 \cdots dx_n}{dy_1 dy_2 \cdots dy_n}\, dy_1 dy_2 \cdots dy_n$$

$$\int g(y_1, y_2, \ldots, y_n)\, dy_1 dy_2 \cdots dy_n = \int f(y_1, y_2, \ldots, y_n)\, J(y_1, y_2, \ldots, y_n)\, dy_1 dy_2 \cdots dy_n \tag{F.15}$$


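F.3.2 Transformation of differential equations:

The differential cross sections for scattering can be defined either by the number of a definite kind of particle, per event, going into the volume element in momentum space $dp_1 dp_2 dp_3$, or by the number going into the solid angle element having momentum between $p$ and $p + dp$. That is, the first definition can be written as a differential equation

$$\frac{\partial^3 S(p_1, p_2, p_3)}{\partial p_1 \partial p_2 \partial p_3}\, dp_1 dp_2 dp_3 = \frac{\partial^3 S\left(p_1(p\theta\phi), p_2(p\theta\phi), p_3(p\theta\phi)\right)}{\partial p_1 \partial p_2 \partial p_3} \frac{\partial(p_1, p_2, p_3)}{\partial(p, \theta, \phi)}\, dp\, d\theta\, d\phi \tag{F.16}$$

As shown in table C.4, $dp_1 dp_2 dp_3 = p^2 \sin\theta\, dp\, d\theta\, d\phi$, that is, the Jacobian equals $p^2 \sin\theta$. Thus equation F.16 can be written as

$$\frac{\partial^3 S(p_1, p_2, p_3)}{\partial p_1 \partial p_2 \partial p_3}\, dp_1 dp_2 dp_3 = \frac{\partial^3 S}{\partial p_1 \partial p_2 \partial p_3}\, p^2 \left( \sin\theta\, dp\, d\theta\, d\phi \right) = \frac{\partial^2 \sigma}{\partial p\, \partial\Omega}\, dp\, d\Omega \tag{F.17}$$

The differential cross section is defined by

$$\frac{\partial^2 \sigma(p, \theta, \phi)}{\partial p\, \partial\Omega} \equiv \frac{\partial^3 S}{\partial p_1 \partial p_2 \partial p_3}\, p^2 \tag{F.18}$$

where the $p^2$ factor is absorbed into the cross section and the solid angle term is factored out.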


F.3.3 Properties of the Jacobian:

In  classical  mechanics  the  Jacobian  often  is  extended  from  3 dimensions  to  n-dimensional  transformations. 
The  Jacobian  is  unity  for  unitary  transformations  such  as  rotations  and  linear  translations  which  implies  that 
the  volume  element  is  preserved.  It  will  be  shown  that  this  also  is  true  for  a certain  class  of  transformations 
in  classical  mechanics  that  are  called  canonical  transformations.  The  Jacobian  transforms  the  local  density 
to  be  correct  for  any  scale  transformations  such  as  transforming  linear  dimensions  from  centimeters  to  inches. 

F.1 Example: Jacobian for transform from cartesian to spherical coordinates

Consider the transform in the three-dimensional integral $\int f(x_1, x_2, x_3)\, dx_1 dx_2 dx_3$ under transformation from cartesian coordinates $(x_1, x_2, x_3)$ to spherical coordinates $(r, \theta, \phi)$. The transformation is governed by the geometric relations $x_1 = r \sin\theta \cos\phi$, $x_2 = r \sin\theta \sin\phi$, $x_3 = r \cos\theta$. For this transformation the Jacobian determinant equals

$$J(r, \theta, \phi) = \begin{vmatrix} \sin\theta\cos\phi & r\cos\theta\cos\phi & -r\sin\theta\sin\phi \\ \sin\theta\sin\phi & r\cos\theta\sin\phi & r\sin\theta\cos\phi \\ \cos\theta & -r\sin\theta & 0 \end{vmatrix} = r^2 \sin\theta$$

Thus the three-dimensional volume integral transforms to

$$\int f(x_1, x_2, x_3)\, dx_1 dx_2 dx_3 = \int f(r, \theta, \phi)\, J(r, \theta, \phi)\, dr\, d\theta\, d\phi = \int f(r, \theta, \phi)\, r^2 \sin\theta\, dr\, d\theta\, d\phi$$

which is the well-known volume integral in spherical coordinates.
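The Jacobian determinant for the transformation to spherical coordinates can also be evaluated numerically. The following sketch builds the matrix of partial derivatives by central differences and compares its determinant with $r^2 \sin\theta$. The function names, the step size, and the evaluation point are assumptions made for illustration.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def jacobian_matrix(func, point, h=1e-6):
    """Matrix with element [i][j] = d x_i / d q_j, by central differences."""
    n = len(point)
    columns = []
    for j in range(n):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = func(*plus), func(*minus)
        columns.append([(a - b) / (2 * h) for a, b in zip(f_plus, f_minus)])
    # columns[j][i] holds d x_i / d q_j; transpose so that rows index x_i.
    return [[columns[j][i] for j in range(n)] for i in range(n)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

r, theta, phi = 2.0, 0.8, 1.9
jacobian_numeric = det3(jacobian_matrix(spherical_to_cartesian, (r, theta, phi)))
jacobian_analytic = r**2 * math.sin(theta)
```

The same two helper functions apply to any three-dimensional coordinate map, for example the cylindrical one, by swapping the mapping function.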


F.4  Legendre  transformation 

Hamiltonian mechanics can be derived directly from Lagrange mechanics by considering the Legendre transformation between the conjugate variables $(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $(\mathbf{q}, \mathbf{p}, t)$. Such a derivation is of considerable importance in that it shows that Hamiltonian mechanics is based on the same variational principles as those used to derive Lagrangian mechanics; that is d'Alembert's Principle or Hamilton's Principle. The general problem of converting Lagrange's equations into the Hamiltonian form hinges on the inversion of equation (8.3) that defines the generalized momentum $\mathbf{p}$. This inversion is simplified by the fact that (8.3) is the first partial derivative of the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ which is a scalar function.

Consider transformations between two functions $F(\mathbf{u}, \mathbf{w})$ and $G(\mathbf{v}, \mathbf{w})$ where $\mathbf{u}$ and $\mathbf{v}$ are the active variables related by the functional form

$$\mathbf{v} = \nabla_{\mathbf{u}} F(\mathbf{u}, \mathbf{w}) \tag{F.19}$$

and where $\mathbf{w}$ designates passive variables and $\nabla_{\mathbf{u}} F(\mathbf{u}, \mathbf{w})$ is the first-order derivative of $F(\mathbf{u}, \mathbf{w})$, i.e. the gradient, with respect to the components of the vector $\mathbf{u}$. The Legendre transform states that the inverse formula can always be written in the form

$$\mathbf{u} = \nabla_{\mathbf{v}} G(\mathbf{v}, \mathbf{w}) \tag{F.20}$$

where the function $G(\mathbf{v}, \mathbf{w})$ is related to $F(\mathbf{u}, \mathbf{w})$ by the symmetric relation

$$G(\mathbf{v}, \mathbf{w}) + F(\mathbf{u}, \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} \tag{F.21}$$

and where the scalar product $\mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^{N} u_i v_i$.

Furthermore the derivatives with respect to all the passive variables $\{w_i\}$ are related by

$$\nabla_{\mathbf{w}} F(\mathbf{u}, \mathbf{w}) = -\nabla_{\mathbf{w}} G(\mathbf{v}, \mathbf{w}) \tag{F.22}$$

The relationship between the functions $F(\mathbf{u}, \mathbf{w})$ and $G(\mathbf{v}, \mathbf{w})$ is symmetrical and each is said to be the Legendre transform of the other.
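A one-dimensional example makes equations F.19 to F.22 concrete. The sketch below assumes the quadratic function $F(u, w) = \frac{1}{2} w u^2$, which is a hypothetical choice analogous to a kinetic energy with the mass playing the role of the passive variable; its Legendre transform is then $G(v, w) = v^2 / 2w$. The derivatives are taken numerically so that the relations are tested rather than assumed.

```python
def F(u, w):
    """Illustrative function of the active variable u and passive variable w."""
    return 0.5 * w * u**2

def v_of_u(u, w):
    """Equation F.19 for this F: v = dF/du = w * u."""
    return w * u

def G(v, w):
    """Equation F.21 solved for G after inverting F.19 (u = v / w)."""
    u = v / w
    return u * v - F(u, w)

def derivative(func, x, h=1e-6):
    """Central-difference derivative of a one-variable function."""
    return (func(x + h) - func(x - h)) / (2 * h)

u, w = 1.7, 2.5
v = v_of_u(u, w)

u_recovered = derivative(lambda vv: G(vv, w), v)   # equation F.20
dF_dw = derivative(lambda ww: F(u, ww), w)         # u held fixed
dG_dw = derivative(lambda ww: G(v, ww), w)         # v held fixed
```

The recovered `u_recovered` matches the starting value of `u`, and `dF_dw` is the negative of `dG_dw`, as equation F.22 requires.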


Workshop  exercises 


1.  Below  you  will  find  a set  of  integrals.  Your  teaching  assistant  will  divide  you  into  groups  and  each  group  will 
be  assigned  one  integral  to  work  on.  Once  your  group  has  solved  the  integral,  write  the  solution  on  the  board 
in  the  space  provided  by  the  teaching  assistant. 


(a)  fo*fo/4fo°S  r-2 sin 9drd9dcf) 

(b)  f(rr~rj)dt 

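(c) $\oint_S \mathbf{A} \cdot d\mathbf{a}$ where $\mathbf{A} = x\hat{\mathbf{i}} + y\hat{\mathbf{j}} + z\hat{\mathbf{k}}$ and $S$ is the sphere $x^2 + y^2 + z^2 = 9$.

(d) $\int_S (\nabla \times \mathbf{A}) \cdot d\mathbf{a}$ where $\mathbf{A} = y\hat{\mathbf{i}} + z\hat{\mathbf{j}} + x\hat{\mathbf{k}}$ and $S$ is the surface defined by the paraboloid $z = 1 - x^2 - y^2$, where $z > 0$.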


Appendix  G 

Vector  differential  calculus 


This  appendix  reviews  vector  differential  calculus  which  is  used  extensively  in  both  classical  mechanics  and 
electromagnetism. 


G.1 Scalar differential operators


G.1.1 Scalar field

Differential operators like time ($\frac{d}{dt}$) do not change the rotational properties of scalars or proper vectors. The result of a scalar operator $\frac{\partial}{\partial s}$ acting on a scalar field $\phi(x, y, z)$ is unchanged in a rotated coordinate frame, where the field is $\phi'(x', y', z')$.

$$\frac{\partial \phi'}{\partial s} = \frac{\partial \phi}{\partial s} \tag{G.1}$$


G.1.2 Vector field


Similarly for a proper vector field

$$\frac{\partial A'_i}{\partial s} = \sum_j \lambda_{ij} \frac{\partial A_j}{\partial s} \tag{G.2}$$

where $\lambda_{ij}$ are the elements of the rotation matrix. That is, differentiation of scalar or vector fields with respect to a scalar operator does not change the rotational behavior. In particular, the scalar differentials of vectors continue to obey the rules of ordinary proper vectors. The scalar operator is used for calculation of velocity or acceleration.


G.2  Vector  differential  operators  in  cartesian  coordinates 

Vector differential operators, such as the gradient operator, are important in physics. The action of vector operators differs along different orthogonal axes.

G.2.1  Scalar  field 

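Consider a continuous, single-valued scalar function $\phi(x_i, x_j, x_k)$. Since

$$\phi' = \phi \tag{G.3}$$

then the partial differential with respect to one component $x'_i$ of the vector $\mathbf{x}'$ gives

$$\frac{\partial \phi'}{\partial x'_i} = \sum_j \frac{\partial \phi}{\partial x_j} \frac{\partial x_j}{\partial x'_i} \tag{G.4}$$

The inverse rotation gives that

$$x_j = \sum_k \lambda_{kj} x'_k \tag{G.5}$$

Therefore

$$\frac{\partial x_j}{\partial x'_i} = \frac{\partial}{\partial x'_i} \left( \sum_k \lambda_{kj} x'_k \right) = \sum_k \lambda_{kj} \delta_{ik} = \lambda_{ij} \tag{G.6}$$

Thus

$$\frac{\partial \phi'}{\partial x'_i} = \sum_j \lambda_{ij} \frac{\partial \phi}{\partial x_j} \tag{G.7}$$

That is, the vector derivative acting on a scalar field transforms like a proper vector.

Define the gradient, or $\nabla$ operator, as

$$\nabla \equiv \sum_i \hat{\mathbf{e}}_i \frac{\partial}{\partial x_i} \tag{G.8}$$

where $\hat{\mathbf{e}}_i$ is the unit vector along the $x_i$ axis. In cartesian coordinates, the del vector operator is

$$\nabla \equiv \hat{\mathbf{i}} \frac{\partial}{\partial x} + \hat{\mathbf{j}} \frac{\partial}{\partial y} + \hat{\mathbf{k}} \frac{\partial}{\partial z} \tag{G.9}$$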

The gradient was applied to the gravitational and electrostatic potential to derive the corresponding field. For example, for electrostatics it was shown that the gradient of the scalar electrostatic potential field $V$ can be written in cartesian coordinates as

$$\mathbf{E} = -\nabla V \tag{G.10}$$

Note  that  the  gradient  of  a scalar  field  produces  a vector  field.  You  are  familiar  with  this  if  you  are  a skier 
in  that  the  gravitational  force  pulls  you  down  the  line  of  steepest  descent  for  the  ski  slope. 


G.2.2  Vector  field 


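Another possible operation for the del operator is the scalar product with a vector. Using the definition of a scalar product in cartesian coordinates gives

$$\nabla \cdot \mathbf{A} = \hat{\mathbf{i}} \cdot \hat{\mathbf{i}} \frac{\partial A_x}{\partial x} + \hat{\mathbf{j}} \cdot \hat{\mathbf{j}} \frac{\partial A_y}{\partial y} + \hat{\mathbf{k}} \cdot \hat{\mathbf{k}} \frac{\partial A_z}{\partial z} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} \tag{G.11}$$

This scalar derivative of a vector field is called the divergence. Note that the scalar product produces a scalar field which is invariant to rotation of the coordinate axes.

The vector product of the del operator with another vector is called the curl, which is used extensively in physics. It can be written in the determinant form

$$\nabla \times \mathbf{A} = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ A_x & A_y & A_z \end{vmatrix} \tag{G.12}$$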


By  contrast  to  the  scalar  product,  both  the  gradient  of  a scalar  field,  and  the  vector  product,  are  vector 
fields  for  which  the  components  along  the  coordinate  axes  transform  in  a specific  manner,  such  as  to  keep  the 
length  of  the  vector  constant,  as  the  coordinate  frame  is  rotated.  The  gradient,  scalar  and  vector  products 
with  the  V operator  are  the  first  order  derivatives  of  fields  that  occur  most  frequently  in  physics. 

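Second derivatives of fields also are used. Let us consider some possible combinations of the product of two del operators.

1) $\nabla \cdot (\nabla V) = \nabla^2 V$

The scalar product of two del operators is a scalar under rotation. Evaluating the scalar product in cartesian coordinates gives

$$\left( \hat{\mathbf{i}} \frac{\partial}{\partial x} + \hat{\mathbf{j}} \frac{\partial}{\partial y} + \hat{\mathbf{k}} \frac{\partial}{\partial z} \right) \cdot \left( \hat{\mathbf{i}} \frac{\partial V}{\partial x} + \hat{\mathbf{j}} \frac{\partial V}{\partial y} + \hat{\mathbf{k}} \frac{\partial V}{\partial z} \right) = \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} \tag{G.13}$$

This also can be obtained without confusion by writing this product as

$$\nabla \cdot (\nabla V) = \nabla \cdot \nabla V = (\nabla \cdot \nabla) V \tag{G.14}$$

where the scalar product of the del operator is a scalar, called the Laplacian $\nabla^2$, given by

$$\nabla \cdot \nabla = \nabla^2 \equiv \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \tag{G.15}$$

The Laplacian operator is encountered frequently in physics.

2) $\nabla \times (\nabla V) = 0$

Note that the vector product of two identical vectors is zero

$$\mathbf{A} \times \mathbf{A} = 0 \tag{G.16}$$

Therefore

$$\nabla \times (\nabla V) = 0 \tag{G.17}$$

This can be confirmed by evaluating the separate components along each axis.

3) $\nabla \cdot (\nabla \times \mathbf{A}) = 0$

This is zero because the cross product $\nabla \times \mathbf{A}$ is perpendicular to $\nabla$ and thus the dot product is zero.

4) $\nabla \times (\nabla \times \mathbf{A}) = \nabla (\nabla \cdot \mathbf{A}) - \nabla^2 \mathbf{A}$

The identity

$$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B} (\mathbf{A} \cdot \mathbf{C}) - (\mathbf{A} \cdot \mathbf{B}) \mathbf{C} \tag{G.18}$$

can be used to give

$$\nabla \times (\nabla \times \mathbf{A}) = \nabla (\nabla \cdot \mathbf{A}) - \nabla^2 \mathbf{A} \tag{G.19}$$

since $\nabla \cdot \nabla = \nabla^2$.

There are pitfalls in the discussion of second derivatives in that it is assumed that both del operators operate on the same variable, otherwise the results are different.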
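The second-derivative identities listed above can be spot-checked numerically with central differences. This is a sketch under stated assumptions: the scalar field `phi`, the vector field `A_field`, the evaluation point, and the step size are arbitrary illustrative choices, and the tiny residuals reflect rounding error rather than a failure of the identities.

```python
import math

STEP = 1e-3

def partial(func, p, i, h=STEP):
    """Central-difference partial derivative of func with respect to p[i]."""
    plus, minus = list(p), list(p)
    plus[i] += h
    minus[i] -= h
    return (func(plus) - func(minus)) / (2 * h)

def grad(scalar):
    return [lambda p, i=i: partial(scalar, p, i) for i in range(3)]

def div(vec):
    return lambda p: sum(partial(vec[i], p, i) for i in range(3))

def curl(vec):
    return [
        lambda p: partial(vec[2], p, 1) - partial(vec[1], p, 2),
        lambda p: partial(vec[0], p, 2) - partial(vec[2], p, 0),
        lambda p: partial(vec[1], p, 0) - partial(vec[0], p, 1),
    ]

def phi(p):
    x, y, z = p
    return x * x * y + math.sin(z) * y

A_field = [lambda p: p[1] * p[2] ** 2,
           lambda p: math.cos(p[0]) * p[2],
           lambda p: p[0] * p[1] * p[2]]

point = [0.3, -0.7, 1.2]

curl_of_grad = [component(point) for component in curl(grad(phi))]   # G.17
div_of_curl = div(curl(A_field))(point)                              # item 3
laplacian_phi = div(grad(phi))(point)                                # G.13
laplacian_exact = 2 * point[1] - math.sin(point[2]) * point[1]
```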


G.3  Vector  differential  operators  in  curvilinear  coordinates 

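As discussed in Appendix C there are many situations where the symmetries make it more convenient to use orthogonal curvilinear coordinate systems rather than cartesian coordinates. Thus it is necessary to extend vector derivatives from cartesian to curvilinear coordinates. Table C.1 can be used for expressing vector derivatives in curvilinear coordinate systems.

G.3.1 Gradient:

The gradient in curvilinear coordinates is

$$\nabla f = \frac{1}{h_1} \frac{\partial f}{\partial q_1} \hat{\mathbf{q}}_1 + \frac{1}{h_2} \frac{\partial f}{\partial q_2} \hat{\mathbf{q}}_2 + \frac{1}{h_3} \frac{\partial f}{\partial q_3} \hat{\mathbf{q}}_3 \tag{G.20}$$

where the coefficients $h_i$ are listed in table C.1.
For cylindrical coordinates this becomes

$$\nabla f = \frac{\partial f}{\partial \rho} \hat{\boldsymbol{\rho}} + \frac{1}{\rho} \frac{\partial f}{\partial \varphi} \hat{\boldsymbol{\varphi}} + \frac{\partial f}{\partial z} \hat{\mathbf{z}} \tag{G.21}$$

In spherical coordinates

$$\nabla f = \frac{\partial f}{\partial r} \hat{\mathbf{r}} + \frac{1}{r} \frac{\partial f}{\partial \theta} \hat{\boldsymbol{\theta}} + \frac{1}{r \sin\theta} \frac{\partial f}{\partial \varphi} \hat{\boldsymbol{\varphi}} \tag{G.22}$$

G.3.2 Divergence: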

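The divergence can be expressed as

$$\nabla \cdot \mathbf{A} = \frac{1}{h_1 h_2 h_3} \left[ \frac{\partial}{\partial q_1} (A_1 h_2 h_3) + \frac{\partial}{\partial q_2} (A_2 h_3 h_1) + \frac{\partial}{\partial q_3} (A_3 h_1 h_2) \right] \tag{G.23}$$

In cylindrical coordinates the divergence is

$$\nabla \cdot \mathbf{A} = \frac{1}{\rho} \frac{\partial}{\partial \rho} (\rho A_\rho) + \frac{1}{\rho} \frac{\partial A_\varphi}{\partial \varphi} + \frac{\partial A_z}{\partial z} = \frac{A_\rho}{\rho} + \frac{\partial A_\rho}{\partial \rho} + \frac{1}{\rho} \frac{\partial A_\varphi}{\partial \varphi} + \frac{\partial A_z}{\partial z} \tag{G.24}$$

In spherical coordinates the divergence is

$$\nabla \cdot \mathbf{A} = \frac{1}{r^2 \sin\theta} \left[ \frac{\partial}{\partial r} (A_r r^2 \sin\theta) + \frac{\partial}{\partial \theta} (A_\theta r \sin\theta) + \frac{\partial}{\partial \varphi} (A_\varphi r) \right] \tag{G.25}$$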

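G.3.3 Curl:

$$\nabla \times \mathbf{A} = \frac{1}{h_1 h_2 h_3} \begin{vmatrix} h_1 \hat{\mathbf{q}}_1 & h_2 \hat{\mathbf{q}}_2 & h_3 \hat{\mathbf{q}}_3 \\ \frac{\partial}{\partial q_1} & \frac{\partial}{\partial q_2} & \frac{\partial}{\partial q_3} \\ h_1 A_1 & h_2 A_2 & h_3 A_3 \end{vmatrix} \tag{G.26}$$

In cylindrical coordinates the curl is

$$\nabla \times \mathbf{A} = \frac{1}{\rho} \begin{vmatrix} \hat{\boldsymbol{\rho}} & \rho \hat{\boldsymbol{\varphi}} & \hat{\mathbf{z}} \\ \frac{\partial}{\partial \rho} & \frac{\partial}{\partial \varphi} & \frac{\partial}{\partial z} \\ A_\rho & \rho A_\varphi & A_z \end{vmatrix} \tag{G.27}$$

In spherical coordinates the curl is

$$\nabla \times \mathbf{A} = \frac{1}{r^2 \sin\theta} \begin{vmatrix} \hat{\mathbf{r}} & r \hat{\boldsymbol{\theta}} & r \sin\theta\, \hat{\boldsymbol{\varphi}} \\ \frac{\partial}{\partial r} & \frac{\partial}{\partial \theta} & \frac{\partial}{\partial \varphi} \\ A_r & r A_\theta & r \sin\theta\, A_\varphi \end{vmatrix} \tag{G.28}$$

G.3.4 Laplacian: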

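Taking the divergence of the gradient of a scalar gives

$$\nabla^2 f = \nabla \cdot \nabla f = \frac{1}{h_1 h_2 h_3} \left[ \frac{\partial}{\partial q_1} \left( \frac{h_2 h_3}{h_1} \frac{\partial f}{\partial q_1} \right) + \frac{\partial}{\partial q_2} \left( \frac{h_3 h_1}{h_2} \frac{\partial f}{\partial q_2} \right) + \frac{\partial}{\partial q_3} \left( \frac{h_1 h_2}{h_3} \frac{\partial f}{\partial q_3} \right) \right] \tag{G.29}$$

The Laplacian of a scalar function $f$ in cylindrical coordinates is

$$\nabla^2 f = \frac{1}{\rho} \frac{\partial}{\partial \rho} \left( \rho \frac{\partial f}{\partial \rho} \right) + \frac{1}{\rho^2} \frac{\partial^2 f}{\partial \varphi^2} + \frac{\partial^2 f}{\partial z^2} \tag{G.30}$$

The Laplacian of a scalar function $f$ in spherical coordinates is

$$\nabla^2 f = \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial f}{\partial r} \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta \frac{\partial f}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2 f}{\partial \varphi^2} \tag{G.31}$$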

The  gradient,  divergence,  curl  and  Laplacian  are  used  extensively  in  curvilinear  coordinate  systems  when 
dealing  with  vector  fields  in  Newtonian  mechanics,  electromagnetism,  and  fluid  flow. 
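As a consistency check on the curvilinear formulas, the standard spherical form of the Laplacian can be compared with the cartesian sum of second derivatives for a test function. The sketch below is illustrative only: the function `f_cart`, the evaluation point, and the helper names are assumptions, and all derivatives are approximated by central differences.

```python
import math

def f_cart(x, y, z):
    """Arbitrary test function; its cartesian Laplacian is 2*z + exp(y)."""
    return x * x * z + math.exp(y)

def f_sph(r, theta, phi):
    """The same function expressed through spherical coordinates."""
    return f_cart(r * math.sin(theta) * math.cos(phi),
                  r * math.sin(theta) * math.sin(phi),
                  r * math.cos(theta))

def d(func, x, h=1e-4):
    """Central-difference derivative of a one-variable function."""
    return (func(x + h) - func(x - h)) / (2 * h)

def laplacian_spherical(g, r, theta, phi):
    """Radial, polar and azimuthal terms of the spherical Laplacian."""
    radial = d(lambda rr: rr**2 * d(lambda s: g(s, theta, phi), rr), r) / r**2
    polar = d(lambda t: math.sin(t) * d(lambda s: g(r, s, phi), t),
              theta) / (r**2 * math.sin(theta))
    azimuthal = d(lambda p: d(lambda s: g(r, theta, s), p),
                  phi) / (r**2 * math.sin(theta)**2)
    return radial + polar + azimuthal

r, theta, phi = 1.3, 0.9, 0.5
x = r * math.sin(theta) * math.cos(phi)
y = r * math.sin(theta) * math.sin(phi)
z = r * math.cos(theta)

laplacian_numeric = laplacian_spherical(f_sph, r, theta, phi)
laplacian_cartesian = 2 * z + math.exp(y)
```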


Appendix  H 

Vector  integral  calculus 


Field  equations,  such  as  for  electromagnetic  and  gravitational  fields,  require  both  line  integrals,  and  surface 
integrals,  of  vector  fields  to  evaluate  potential,  flux  and  circulation.  These  require  use  of  the  gradient,  the 
Divergence  Theorem  and  Stokes  Theorem  which  are  discussed  in  the  following  sections. 


H.1 Line integral of the gradient of a scalar field

The change $\Delta V$ in a scalar field for an infinitesimal step $d\mathbf{l}$ along a path can be written as

$$\Delta V = (\nabla V) \cdot d\mathbf{l} \tag{H.1}$$

since the gradient of $V$, that is, $\nabla V$, is the rate of change of $V$ with $d\mathbf{l}$. Discussions of gravitational and electrostatic potential show that the line integral between points $a$ and $b$ is given in terms of the del operator by

$$V_b - V_a = \int_a^b (\nabla V) \cdot d\mathbf{l} \tag{H.2}$$

This relates the difference in values of a scalar field at two points to the line integral of the dot product of the gradient with the element of the line integral.
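Equation H.2 can be illustrated by summing $(\nabla V) \cdot d\mathbf{l}$ over many short steps along a path and comparing with the difference of the field at the end points. In this sketch the scalar field `V`, its analytic gradient, and the helical path are hypothetical choices, and the midpoint rule is one of several possible quadrature schemes.

```python
import math

def V(x, y, z):
    """Illustrative scalar field."""
    return x * y + z * z * math.sin(x)

def grad_V(x, y, z):
    """Analytic gradient of V."""
    return (y + z * z * math.cos(x), x, 2 * z * math.sin(x))

def path(t):
    """Helical path from point a (t = 0) to point b (t = 1)."""
    return (math.cos(2 * t), math.sin(2 * t), 0.5 * t)

def line_integral(grad, curve, n=20000):
    """Midpoint-rule sum of grad . dl over n straight segments of the curve."""
    total = 0.0
    previous = curve(0.0)
    for k in range(1, n + 1):
        current = curve(k / n)
        midpoint = tuple(0.5 * (a + b) for a, b in zip(previous, current))
        dl = tuple(b - a for a, b in zip(previous, current))
        total += sum(g * step for g, step in zip(grad(*midpoint), dl))
        previous = current
    return total

integral = line_integral(grad_V, path)
difference = V(*path(1.0)) - V(*path(0.0))
```

Replacing the helix by any other path between the same end points leaves `integral` unchanged to within the quadrature error, which is the path independence expected for the gradient of a scalar field.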


H.2  Divergence  theorem 

H.2.1  Flux  of  a vector  field  for  Gaussian  surface 


Consider the flux $\Phi$ of a vector field $\mathbf{F}$ for a closed surface $S$, usually called a Gaussian surface, shown in figure H.1.

$$\Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} \tag{H.3}$$

Suppose the enclosed volume is cut into two pieces enclosed by surfaces $S_1 = S_a + S_{ab}$ and $S_2 = S_b + S_{ab}$. The flux through the surface $S_{ab}$ common to both $S_1$ and $S_2$ is equal and in the same direction for both pieces. Then the net flux through the sum of $S_1$ and $S_2$ is given by

$$\oint_{S_1} \mathbf{F} \cdot d\mathbf{S} + \oint_{S_2} \mathbf{F} \cdot d\mathbf{S} = \oint_S \mathbf{F} \cdot d\mathbf{S} \tag{H.4}$$

since the contributions of the common surface $S_{ab}$ cancel, in that the flux out of $S_1$ through $S_{ab}$ equals the flux into $S_2$ through $S_{ab}$. That is, independent of how many times the volume enclosed by $S$ is subdivided, the net flux for the sum of all the Gaussian surfaces enclosing these subdivisions of the volume still equals $\oint_S \mathbf{F} \cdot d\mathbf{S}$.

Figure H.1: A volume $V$ enclosed by a closed surface $S$ is cut into two pieces at the surface $S_{ab}$. This gives $V_1$ enclosed by $S_1$ and $V_2$ enclosed by $S_2$.

Consider that the volume enclosed by $S$ is subdivided into $N$ subdivisions where $N \to \infty$. Even though $\oint_{S_i} \mathbf{F} \cdot d\mathbf{S} \to 0$ as $N \to \infty$, the sum over the surfaces of all the infinitesimal volumes remains unchanged

$$\Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \sum_{i}^{N \to \infty} \oint_{S_i} \mathbf{F} \cdot d\mathbf{S} \tag{H.5}$$

Thus we can take the limit of a sum of an infinite number of infinitesimal volumes as is needed to obtain a differential form. The surface integral for each infinitesimal volume will equal zero, which is not useful, that is $\oint_{S_i} \mathbf{F} \cdot d\mathbf{S} \to 0$ as $N \to \infty$. However, the flux per unit volume has a finite value as $N \to \infty$. This ratio is called the divergence of the vector field;

$$\text{div}\, \mathbf{F} = \lim_{\Delta\tau_i \to 0} \frac{\oint_{S_i} \mathbf{F} \cdot d\mathbf{S}}{\Delta\tau_i} \tag{H.6}$$

where $\Delta\tau_i$ is the infinitesimal volume enclosed by surface $S_i$. The divergence of the vector field is a scalar quantity.

Thus the sum of flux over all infinitesimal subdivisions of the volume enclosed by a closed surface $S$ equals

$$\Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \sum_{i}^{N \to \infty} \frac{\oint_{S_i} \mathbf{F} \cdot d\mathbf{S}}{\Delta\tau_i} \Delta\tau_i = \sum_{i}^{N \to \infty} \text{div}\, \mathbf{F}\, \Delta\tau_i \tag{H.7}$$

In the limit $N \to \infty$, $\Delta\tau_i \to 0$, this becomes the integral;

$$\Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \int_{\text{Enclosed volume}} \text{div}\, \mathbf{F}\, d\tau \tag{H.8}$$

This is called the Divergence Theorem or Gauss's Theorem. To avoid confusion with Gauss's law in electrostatics, it will be referred to as the Divergence theorem.


H.2.2  Divergence  in  cartesian  coordinates. 

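Consider the special case of an infinitesimal rectangular box, size $\Delta x, \Delta y, \Delta z$, shown in figure H.2. Consider the net flux for the $z$ component $F_z$ entering the surface $\Delta x \Delta y$ at location $(x, y, z)$.

$$\Delta\Phi_z^{in} = \left( F_z + \frac{\Delta x}{2} \frac{\partial F_z}{\partial x} + \frac{\Delta y}{2} \frac{\partial F_z}{\partial y} \right) \Delta x \Delta y \tag{H.9}$$

The net flux of the $z$ component out of the surface at $z + \Delta z$ is

$$\Delta\Phi_z^{out} = \left( F_z + \Delta z \frac{\partial F_z}{\partial z} + \frac{\Delta x}{2} \frac{\partial F_z}{\partial x} + \frac{\Delta y}{2} \frac{\partial F_z}{\partial y} \right) \Delta x \Delta y \tag{H.10}$$

Thus the net flux out of the box due to the $z$ component of $\mathbf{F}$ is

$$\Delta\Phi_z = \Delta\Phi_z^{out} - \Delta\Phi_z^{in} = \frac{\partial F_z}{\partial z} \Delta x \Delta y \Delta z \tag{H.11}$$

Adding the similar $x$ and $y$ components for $\Delta\Phi$ gives

$$\Delta\Phi = \left( \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} \right) \Delta x \Delta y \Delta z \tag{H.12}$$

Figure H.2: Computation of flux out of an infinitesimal rectangular box, $\Delta x, \Delta y, \Delta z$.

This gives that the divergence of the vector field $\mathbf{F}$ is

$$\text{div}\, \mathbf{F} = \lim_{\Delta\tau_i \to 0} \frac{\oint_{S_i} \mathbf{F} \cdot d\mathbf{S}}{\Delta\tau_i} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} \tag{H.13}$$

since $\Delta\tau = \Delta x \Delta y \Delta z$. But the right hand side of the equation equals the scalar product $\nabla \cdot \mathbf{F}$, that is,

$$\text{div}\, \mathbf{F} = \nabla \cdot \mathbf{F} \tag{H.14}$$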

The divergence is a scalar quantity. The physical meaning of the divergence is that it gives the net flux per unit volume flowing out of an infinitesimal volume. A positive divergence corresponds to a net outflow of flux from the infinitesimal volume at any location while a negative divergence implies a net inflow of flux to this infinitesimal volume.

It was shown that for an infinitesimal rectangular box

$$\Delta\Phi = \left( \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} \right) \Delta x \Delta y \Delta z = \nabla \cdot \mathbf{F}\, \Delta\tau \tag{H.15}$$

Integrating over the finite volume enclosed by the surface $S$ gives

$$\Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \int_{\text{Enclosed volume}} \nabla \cdot \mathbf{F}\, d\tau \tag{H.16}$$

This is another way of expressing the Divergence theorem

$$\Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \int_{\text{Enclosed volume}} \text{div}\, \mathbf{F}\, d\tau \tag{H.17}$$

The divergence theorem, developed by Gauss, is of considerable importance. It relates the surface integral of a vector field, that is, the outgoing flux, to a volume integral of $\nabla \cdot \mathbf{F}$ over the enclosed volume.
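The interpretation of the divergence as the net outward flux per unit volume can be tested numerically on a small cube. The sketch below sums the flux through the six faces with the midpoint rule and divides by the volume. The vector field `F`, the cube centre, the cube size, and the function names are illustrative assumptions.

```python
import math

def F(x, y, z):
    """Illustrative vector field."""
    return (x * x * y, y * z, math.sin(x) * z)

def divergence_exact(x, y, z):
    """Analytic divergence of F: dFx/dx + dFy/dy + dFz/dz."""
    return 2 * x * y + z + math.sin(x)

def flux_out_of_box(field, center, size, n=10):
    """Outward flux through the six faces of a small axis-aligned cube."""
    half = size / 2
    step = size / n
    total = 0.0
    for axis in range(3):
        u_axis, v_axis = [a for a in range(3) if a != axis]
        for sign in (1, -1):              # outward normal is +axis or -axis
            for i in range(n):
                for j in range(n):
                    p = list(center)
                    p[axis] += sign * half
                    p[u_axis] += -half + (i + 0.5) * step
                    p[v_axis] += -half + (j + 0.5) * step
                    total += sign * field(*p)[axis] * step * step
    return total

center = (0.5, -0.2, 0.8)
size = 1e-2
flux_per_volume = flux_out_of_box(F, center, size) / size**3
div_at_center = divergence_exact(*center)
```

Shrinking `size` brings `flux_per_volume` closer to `div_at_center`, until rounding error in the cancellation between opposite faces starts to dominate.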


H.1 Example: Maxwell's Flux Equations


As an example of the usefulness of this relation, consider Gauss's law for the flux in Maxwell's equations.

Gauss's law for the electric field

$$\Phi_E = \oint_{\text{Closed surface}} \mathbf{E} \cdot d\mathbf{S} = \frac{1}{\varepsilon_0} \int_{\text{enclosed volume}} \rho\, d\tau$$

But the divergence relation gives that

$$\Phi_E = \oint_{\text{Closed surface}} \mathbf{E} \cdot d\mathbf{S} = \int_{\text{Enclosed volume}} \nabla \cdot \mathbf{E}\, d\tau$$

Combining these gives

$$\oint_{\text{Closed surface}} \mathbf{E} \cdot d\mathbf{S} = \int_{\text{Enclosed volume}} \nabla \cdot \mathbf{E}\, d\tau = \frac{1}{\varepsilon_0} \int_{\text{enclosed volume}} \rho\, d\tau$$

This is true independent of the shape of the surface or enclosed volume, leading to the differential form of Maxwell's first law, that is Gauss's law for the electric field.

$$\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}$$

The differential form of Gauss's law relates $\nabla \cdot \mathbf{E}$ to the charge density $\rho$ at that same location. This is much easier to evaluate than the surface and volume integrals required using the integral form of Gauss's law.

Gauss's law for magnetism

$$\Phi_B = \oint_{\text{Closed surface}} \mathbf{B} \cdot d\mathbf{S} = 0$$

Using the divergence theorem gives that

$$\Phi_B = \oint_{\text{Closed surface}} \mathbf{B} \cdot d\mathbf{S} = \int_{\text{Enclosed volume}} \nabla \cdot \mathbf{B}\, d\tau = 0$$

This is true independent of the shape of the Gaussian surface, leading to the differential form of Gauss's law for $\mathbf{B}$

$$\nabla \cdot \mathbf{B} = 0$$

That is, the local value of the divergence of $\mathbf{B}$ is zero everywhere.

H.2  Example:  Buoyancy  forces  in  fluids 

Buoyancy in fluids provides an example of the use of flux in physics. Consider a fluid of density $\rho(z)$
in a gravitational field $\mathbf{g}(z) = -g(z)\,\hat{\mathbf{z}}$ where the $z$ axis points in the opposite direction to the gravitational
force. Pressure equals force per unit area and is a scalar quantity. For a conservative fluid system, in static
equilibrium, the net work done per unit area for an infinitesimal displacement $d\mathbf{r}$ is zero. The net pressure
force per unit area is the difference $P(\mathbf{r} + d\mathbf{r}) - P(\mathbf{r}) = \nabla P\cdot d\mathbf{r}$ while the net change in gravitational potential
energy is $\rho(z)\,g(z)\,\hat{\mathbf{z}}\cdot d\mathbf{r}$. Thus energy conservation gives

$$\left[\nabla P + \rho(z)\,g(z)\,\hat{\mathbf{z}}\right]\cdot d\mathbf{r} = 0$$


which can be expanded as

$$\frac{dP}{dz} = -\rho(z)\,g(z), \qquad \frac{dP}{dx} = \frac{dP}{dy} = 0 \tag{A}$$


Integrating  the  net  forces  normal  to  the  surface  over  any  closed  surface  enclosing  an  empty  volume,  inside 
the  fluid,  gives  a net  buoyancy  force  on  this  volume  that  simplifies  using  the  Divergence  theorem 


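$$\oint \mathbf{F}\cdot d\mathbf{S} = \oint P\,dS = \int_{\text{Enclosed vol}} \left(\frac{\partial P}{\partial x} + \frac{\partial P}{\partial y} + \frac{\partial P}{\partial z}\right) d\tau$$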


Using equations A leads to the net buoyancy force

$$\oint \mathbf{F}\cdot d\mathbf{S} = \int_{\text{Enclosed vol}} \frac{\partial P}{\partial z}\,d\tau = -\int_{\text{Enclosed vol}} \rho(z)\,g(z)\,d\tau$$


The right-hand side of this equation equals minus the weight of the displaced fluid. That is, the buoyancy force
equals the weight of the fluid displaced by the empty volume. Note that this proof applies both to compressible
fluids, where the density depends on pressure, and to incompressible fluids, where the density is constant.
It also applies to situations where local gravity $g$ is position dependent. If an object of mass $M$ is completely
submerged then the net force on the object is $Mg - \int_{\text{Enclosed vol}} \rho(z)\,g(z)\,d\tau$. If the object floats on the surface
of a fluid then the buoyancy force must be calculated separately for the volume under the fluid surface and
the upper volume above the fluid surface. The buoyancy due to displaced air usually is negligible since the
density of air is about $10^{-3}$ times that of fluids such as water.

H.3  Stokes  Theorem 


H.3.1  The  curl 

Maxwell’s  laws  relate  the  circulation  of  the  field  around  a closed  loop  to  the  rate  of  change  of  flux  through 
the  surface  bounded  by  the  closed  loop.  It  is  possible  to  write  these  integral  equations  in  a differential  form 
as  follows. 

Consider  the  line  integral  around  a closed  loop  C shown  in  figure  H.3. 

If this area is subdivided into two areas enclosed by loops $C_1$ and $C_2$, then the sum of the line integrals
is the same

$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \oint_{C_1} \mathbf{F}\cdot d\mathbf{l} + \oint_{C_2} \mathbf{F}\cdot d\mathbf{l} \tag{H.18}$$

because the contributions along the common boundary cancel since they are taken in opposite directions if
$C_1$ and $C_2$ both are taken in the same direction. Note that the line integral, and corresponding enclosed area,
are vector quantities related by the right-hand rule and this must be taken into account when subdividing
the area. Thus the area can be subdivided into an infinite number of pieces for which


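$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \lim_{N\to\infty}\sum_i^N \oint_{C_i} \mathbf{F}\cdot d\mathbf{l} = \lim_{N\to\infty}\sum_i^N \frac{\oint_{C_i} \mathbf{F}\cdot d\mathbf{l}}{\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}}}\,\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}} \tag{H.19}$$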


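where $\Delta\mathbf{S}_i$ is the infinitesimal area bounded by the closed sub-loop $C_i$ and $\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}}$ is the normal component
of this area pointing along the $\hat{\mathbf{n}}$ direction, which is the direction along which the line integral points.

The component of the curl of the vector function along the direction $\hat{\mathbf{n}}$ is defined to be

$$(\operatorname{curl}\mathbf{F})\cdot\hat{\mathbf{n}} \equiv \lim_{\Delta S_i\to 0} \frac{\oint_{C_i} \mathbf{F}\cdot d\mathbf{l}}{\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}}} \tag{H.20}$$

Thus the line integral can be written as

$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \lim_{N\to\infty}\sum_i^N \frac{\oint_{C_i} \mathbf{F}\cdot d\mathbf{l}}{\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}}}\,\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}} = \int_S \left[(\operatorname{curl}\mathbf{F})\cdot\hat{\mathbf{n}}\right] d\mathbf{S}\cdot\hat{\mathbf{n}} \tag{H.21}$$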


The product $\hat{\mathbf{n}}\cdot\hat{\mathbf{n}} = 1$, that is, this is true independent of the
direction of the infinitesimal loop. Thus the above relation leads
to Stokes Theorem

$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \int_{\text{Area bounded by } C} (\operatorname{curl}\mathbf{F})\cdot d\mathbf{S} \tag{H.22}$$

Figure H.3: The circulation around a path is equal to the sum of the circulations around subareas made by
subdividing the area.


This  relates  the  line  integral  to  a surface  integral  over  a surface 
bounded  by  the  loop. 


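H.3.2 Curl in cartesian coordinates

Consider the infinitesimal rectangle $\Delta x\,\Delta y$ pointing in the $\hat{\mathbf{k}}$ direction shown in figure H.4.
The line integral, taken in a right-handed way around $\hat{\mathbf{k}}$, gives

$$\oint \mathbf{F}\cdot d\mathbf{l} = F_x\,\Delta x + \left(F_y + \frac{\partial F_y}{\partial x}\Delta x\right)\Delta y - \left(F_x + \frac{\partial F_x}{\partial y}\Delta y\right)\Delta x - F_y\,\Delta y = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\Delta x\,\Delta y \tag{H.23}$$

Thus, since $\Delta x\,\Delta y = \Delta S_z$, the $z$ component of the curl is given by

$$(\operatorname{curl}\mathbf{F})\cdot\hat{\mathbf{k}} = \frac{\oint \mathbf{F}\cdot d\mathbf{l}}{\Delta\mathbf{S}_z\cdot\hat{\mathbf{n}}} = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) \tag{H.24}$$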


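The same argument gives the component of the curl in the $y$ direction

$$(\operatorname{curl}\mathbf{F})\cdot\hat{\mathbf{j}} = \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) \tag{H.25}$$

Similarly, the same argument gives the component of the curl in the $x$ direction

$$(\operatorname{curl}\mathbf{F})\cdot\hat{\mathbf{i}} = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right) \tag{H.26}$$

Figure H.4: Circulation around an infinitesimal rectangle $\Delta x\,\Delta y$ in the $z$ direction.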
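Thus combining the three components of the curl gives

$$\operatorname{curl}\mathbf{F} = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)\hat{\mathbf{i}} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\hat{\mathbf{j}} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\hat{\mathbf{k}} \tag{H.27}$$

Note that the cross-product of the del operator with the vector $\mathbf{F}$ is

$$\nabla\times\mathbf{F} = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix} \tag{H.28}$$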


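which is identical to the right-hand side of the relation for the curl in cartesian coordinates. That is;

$$\nabla\times\mathbf{F} = \operatorname{curl}\mathbf{F} \tag{H.29}$$

Therefore Stokes Theorem can be rewritten as

$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \int_{\text{Area bounded by } C} (\operatorname{curl}\mathbf{F})\cdot d\mathbf{S} = \int_{\text{Area bounded by } C} (\nabla\times\mathbf{F})\cdot d\mathbf{S} \tag{H.30}$$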

The physical meaning of the curl is that it is the circulation, or rotation, per unit area for an infinitesimal loop
at any location. In German-language literature the curl is called the rotation and is written $\operatorname{rot}\mathbf{F}$.
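This interpretation can be illustrated numerically. The sketch below estimates the $z$ component of the curl as the circulation around a small square divided by its area, in the spirit of equations (H.20) and (H.24). The field $\mathbf{F} = (-y, x^2, 0)$, the loop size, and the function names are assumptions made for this example; for this field $\partial F_y/\partial x - \partial F_x/\partial y = 2x + 1$.

```python
def field(x, y):
    """Example planar field (Fx, Fy) = (-y, x**2); an assumption made for illustration."""
    return (-y, x * x)

def circulation_per_area(x0, y0, h=1e-3):
    """Counterclockwise circulation around a square of side h centred on (x0, y0),
    divided by the area h**2. Each edge is evaluated with the midpoint rule."""
    bottom = field(x0, y0 - h / 2)[0] * h    # traversed in the +x direction
    right = field(x0 + h / 2, y0)[1] * h     # traversed in the +y direction
    top = -field(x0, y0 + h / 2)[0] * h      # traversed in the -x direction
    left = -field(x0 - h / 2, y0)[1] * h     # traversed in the -y direction
    return (bottom + right + top + left) / h**2

def curl_z_exact(x0, y0):
    """Analytic dFy/dx - dFx/dy for the example field."""
    return 2 * x0 + 1

print(circulation_per_area(0.5, 0.3), curl_z_exact(0.5, 0.3))
```

The loop is traversed in the right-handed sense about $\hat{\mathbf{k}}$, so a positive result corresponds to counterclockwise rotation.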

H.3  Example:  Maxwell’s  circulation  equations 

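As an example of the use of the curl, consider Faraday's Law

$$\oint_{\text{Closed loop } C} \mathbf{E}\cdot d\mathbf{l} = -\int_{\text{Surface bounded by } C} \frac{\partial\mathbf{B}}{\partial t}\cdot d\mathbf{S}$$

Using Stokes Theorem gives

$$\oint_C \mathbf{E}\cdot d\mathbf{l} = \int_{\text{Surface bounded by } C} (\nabla\times\mathbf{E})\cdot d\mathbf{S}$$

These two relations are independent of the shape of the closed loop, thus we obtain Faraday's Law in the
differential form

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}$$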

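A differential form of the Ampère-Maxwell law also can be obtained from

$$\oint_{\text{Closed loop } C} \mathbf{B}\cdot d\mathbf{l} = \mu_0\int_{\text{Surface bounded by } C} \left(\mathbf{j} + \varepsilon_0\frac{\partial\mathbf{E}}{\partial t}\right)\cdot d\mathbf{S}$$

Using Stokes Theorem

$$\oint_C \mathbf{B}\cdot d\mathbf{l} = \int_{\text{Surface bounded by } C} (\nabla\times\mathbf{B})\cdot d\mathbf{S}$$

Again this is independent of the shape of the loop and thus we obtain the
Ampère-Maxwell law in differential form

$$\nabla\times\mathbf{B} = \mu_0\,\mathbf{j} + \mu_0\varepsilon_0\frac{\partial\mathbf{E}}{\partial t}$$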


The  differential  forms  of  Maxwell’s  circulation  relations  are  easier  to  apply  than  the  integral  equations 
because  the  differential  form  relates  the  curl  to  the  time  derivatives  at  the  same  specific  location. 
H.4 Potential formulations of curl-free and divergence-free fields

Interesting  consequences  result  from  the  Divergence  theorem  and  Stokes  Theorem  for  vector  fields  that  are 
either  curl-free  or  divergence-free.  In  particular  two  theorems  result  from  the  second  derivatives  of  a vector 
field. 

Theorem  1;  Curl-free  (irrotational)  fields: 

For curl-free fields

$$\nabla\times\mathbf{F} = 0 \tag{H.31}$$

everywhere. This is automatically obeyed if the vector field is expressed as the gradient of a scalar field

$$\mathbf{F} = \nabla\phi \tag{H.32}$$

since

$$\nabla\times(\nabla\phi) = 0 \tag{H.33}$$

That is, any curl-free vector field can be expressed in terms of the gradient of a scalar field.

The scalar field $\phi$ is not unique, that is, any constant $a$ can be added to $\phi$ since $\nabla a = 0$, that is, the
addition of the constant $a$ does not change the gradient. This independence to addition of a number to the
scalar potential is called a gauge invariance, discussed in chapter 13.2, for which

$$\mathbf{F} = \nabla\phi' = \nabla(\phi + a) = \nabla\phi \tag{H.34}$$

That  is,  this  gauge-invariant  transformation  does  not  change  the  observable  F.  The  electrostatic  field  E 
and  the  gravitation  field  g are  examples  of  irrotational  fields  that  can  be  expressed  as  the  gradient  of  scalar 
potentials. 

Theorem  2;  Divergence-free  (solenoidal)  fields: 

For divergence-free fields

$$\nabla\cdot\mathbf{F} = 0 \tag{H.35}$$

everywhere. This is automatically obeyed if the field $\mathbf{F}$ is expressed in terms of the curl of a vector field $\mathbf{G}$
such that

$$\mathbf{F} = \nabla\times\mathbf{G} \tag{H.36}$$

since $\nabla\cdot\nabla\times\mathbf{G} = 0$. That is, any divergence-free vector field can be written as the curl of a related vector
field.

As discussed in chapter 13.2, the vector potential $\mathbf{G}$ is not unique in that a gauge transformation can be
made by adding the gradient of any scalar field, that is, the gauge transformation $\mathbf{G}' = \mathbf{G} + \nabla\varphi$ gives

$$\mathbf{F} = \nabla\times\mathbf{G}' = \nabla\times(\mathbf{G} + \nabla\varphi) = \nabla\times\mathbf{G}. \tag{H.37}$$

This  gauge  invariance  for  transformation  to  the  vector  potential  G'  does  not  change  the  observable  vector 
field  F.  The  magnetic  field  B is  an  example  of  a solenoidal  field  that  can  be  expressed  in  terms  of  the  curl 
of  a vector  potential  A. 
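The two identities behind these theorems, $\nabla\times(\nabla\phi) = 0$ and $\nabla\cdot(\nabla\times\mathbf{G}) = 0$, can be spot-checked with finite differences. The Python sketch below is only an illustration: the test fields `phi` and `G`, the step size `H`, and the helper names are assumptions introduced here, not part of the text.

```python
import math

H = 1e-3  # finite-difference step (an assumed value)

def partial(f, axis, point):
    """Central-difference partial derivative of the scalar function f along one axis."""
    up, down = list(point), list(point)
    up[axis] += H
    down[axis] -= H
    return (f(*up) - f(*down)) / (2 * H)

def component(F, k):
    """Return the k-th component of the vector field F as a scalar function."""
    return lambda x, y, z: F(x, y, z)[k]

def grad(f):
    return lambda x, y, z: tuple(partial(f, i, (x, y, z)) for i in range(3))

def curl(F):
    def curl_at(x, y, z):
        d = lambda k, i: partial(component(F, k), i, (x, y, z))  # dF_k / dx_i
        return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return curl_at

def div(F):
    return lambda x, y, z: sum(partial(component(F, i), i, (x, y, z)) for i in range(3))

def phi(x, y, z):
    """Arbitrary smooth scalar potential used for the check."""
    return x * x * y + y * math.sin(z)

def G(x, y, z):
    """Arbitrary smooth vector potential used for the check."""
    return (y * z * z, math.sin(x) * z, x * y * y)

point = (0.3, -0.7, 1.1)
curl_of_grad = curl(grad(phi))(*point)
div_of_curl = div(curl(G))(*point)
print(curl_of_grad, div_of_curl)
```

Both results vanish to within rounding error, because the mixed second differences cancel term by term, just as the mixed partial derivatives do in the analytic identities.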

H.4  Example:  Electromagnetic  fields: 

Electromagnetic  interactions  are  encountered  frequently  in  classical  mechanics  so  it  is  useful  to  discuss 
the  use  of  potential  formulations  of  electrodynamics. 

For electrostatics, Maxwell's equations give that

$$\nabla\times\mathbf{E} = 0$$

Therefore theorem 1 states that it is possible to express this static electric field as the gradient of the scalar
electric potential $V$, where

$$\mathbf{E} = -\nabla V$$
For electrodynamics, Maxwell's equations give that

$$\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0$$

Assume that the magnetic field can be expressed in terms of the vector potential $\mathbf{B} = \nabla\times\mathbf{A}$, then
the above equation becomes

$$\nabla\times\left(\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t}\right) = 0$$

Theorem 1 gives that this curl-less field can be expressed as the gradient of a scalar field, here taken to
be the electric potential $V$.

$$\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t} = -\nabla V$$


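that is

$$\mathbf{E} = -\left(\nabla V + \frac{\partial\mathbf{A}}{\partial t}\right)$$

Gauss' law states that

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0}$$

which can be rewritten as

$$\nabla\cdot\mathbf{E} = -\nabla^2 V - \frac{\partial(\nabla\cdot\mathbf{A})}{\partial t} = \frac{\rho}{\varepsilon_0} \tag{X}$$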


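Similarly, insertion of the vector potential $\mathbf{A}$ in Ampère's Law gives

$$\nabla\times\mathbf{B} = \nabla\times(\nabla\times\mathbf{A}) = \mu_0\,\mathbf{j} + \mu_0\varepsilon_0\frac{\partial\mathbf{E}}{\partial t} = \mu_0\,\mathbf{j} - \mu_0\varepsilon_0\nabla\left(\frac{\partial V}{\partial t}\right) - \mu_0\varepsilon_0\left(\frac{\partial^2\mathbf{A}}{\partial t^2}\right)$$

Using the vector identity $\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$ allows the above equation to be rewritten as

$$\left(\nabla^2\mathbf{A} - \mu_0\varepsilon_0\frac{\partial^2\mathbf{A}}{\partial t^2}\right) - \nabla\left(\nabla\cdot\mathbf{A} + \mu_0\varepsilon_0\frac{\partial V}{\partial t}\right) = -\mu_0\,\mathbf{j} \tag{Y}$$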


The  use  of  the  scalar  potential  V and  vector  potential  A leads  to  two  coupled  equations  X and  Y.  These 
coupled  equations  can  be  transformed  into  two  uncoupled  equations  by  exploiting  the  freedom  to  make  a gauge 
transformation  for  the  vector  potential  such  that  the  middle  brackets  in  both  equations  X and  Y are  zero. 
That  is,  choosing  the  Lorentz  gauge 


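$$\nabla\cdot\mathbf{A} = -\mu_0\varepsilon_0\frac{\partial V}{\partial t}$$

simplifies equations X and Y to be

$$\nabla^2 V - \mu_0\varepsilon_0\frac{\partial^2 V}{\partial t^2} = -\frac{\rho}{\varepsilon_0}$$

$$\nabla^2\mathbf{A} - \mu_0\varepsilon_0\frac{\partial^2\mathbf{A}}{\partial t^2} = -\mu_0\,\mathbf{j}$$

The virtue of using the Lorentz gauge, rather than the Coulomb gauge $\nabla\cdot\mathbf{A} = 0$, is that it separates the
equations for the scalar and vector potentials. Moreover, these two equations are the wave equations for these
two potential fields corresponding to a velocity $c = \frac{1}{\sqrt{\mu_0\varepsilon_0}}$. This example illustrates the power of using the
concept of potentials in describing vector fields.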


Appendix  I 


Waveform  analysis 


I.1 Harmonic waveform decomposition

The response of any linear system that is subject to a time-dependent forcing function $F(t)$ can be expressed as a linear
superposition of frequency-dependent solutions of the individual harmonic decomposition $a(\omega)$ of the forcing
function. Similarly, the response of any linear system subject to a spatially-dependent forcing function $F(x)$ can be expressed
as a linear superposition of the wavenumber-dependent solutions of the individual harmonic decomposition
$a(k_x)$ of the forcing function. Fourier analysis provides the mathematical procedure for the transformation
between the periodic waveforms and the harmonic content, that is, $F(t) \Leftrightarrow a(\omega)$, or $F(x) \Leftrightarrow a(k_x)$. Fourier's
theorem states that any arbitrary forcing function $F(t)$ can be decomposed into a sum of harmonic terms.
For example, for a time-dependent periodic forcing function the decomposition can be a cosine series of the
form

$$F(t) = \sum_{n=1}^{\infty} a_n\cos(n\omega_0 t + \phi_n) \tag{I.1}$$

where $\omega_0$ is the lowest (fundamental) frequency solution. For an aperiodic function a cosine decomposition
can be of the form

$$F(t) = \int_0^{\infty} a(\omega)\cos(\omega t + \phi(\omega))\,d\omega \tag{I.2}$$

Either of the complementary functions $F(t) \Leftrightarrow a(\omega)$, or $F(x) \Leftrightarrow a(k_x)$, are equivalent representations of
the harmonic content that can be used to describe signals and waves. The following two sections give an
introduction to Fourier analysis.

I.1.1 Periodic systems and the Fourier series

Discrete solutions occur for systems when periodic boundary conditions exist. The response of periodic
systems can be described in either the time versus angular frequency domains, or equivalently, the spatial
coordinate $x$ versus the corresponding wave number $k_x$. For periodic systems this decomposition leads to
the Fourier series where a generalized phase coordinate $\phi$ can be used to represent either the time or spatial
coordinates, that is, with $\phi = \omega_0 t$ or $\phi = k_x x$ respectively. The Fourier series relates the two representations
of the discrete wave solutions for such periodic systems.

Fourier's theorem states that for a general periodic system any arbitrary forcing function $f(\phi)$ can be
decomposed into a sum of sinusoidal or cosinusoidal terms. The summation can be represented by three
equivalent series expansions given below, where $\phi = \omega_0 t$ or $\phi = k_0 x$, and where $\omega_0, k_0$ are the fundamental
angular frequency and fundamental wave number respectively.

$$f(\phi) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left[a_n\cos(n\phi) + b_n\sin(n\phi)\right] \tag{I.3}$$

$$f(\phi) = \frac{a_0}{2} + \sum_{n=1}^{\infty} c_n\cos(n\phi + \varphi_n) \tag{I.4}$$

$$f(\phi) = \frac{a_0}{2} + \sum_{n=1}^{\infty} d_n\sin(n\phi + \theta_n) \tag{I.5}$$

where $n$ is an integer, and $\varphi_n, \theta_n$ are phase shifts fit to the initial conditions.

The normal modes of a discrete system form a complete set of solutions that satisfy the following
orthogonality relation

$$\int_0^{2\pi} f_n(\phi)\,f_m(\phi)\,d\phi = C_n\,\delta_{mn} \tag{I.6}$$

where $\delta_{mn}$ is the Kronecker delta symbol defined in equation (A.10). Orthogonality can be used to determine
the coefficients for equation (I.3) to be

$$a_0 = \frac{1}{\pi}\int_{-\pi}^{+\pi} f(\phi)\,d\phi \tag{I.7}$$

$$a_n = \frac{1}{\pi}\int_{-\pi}^{+\pi} f(\phi)\cos(n\phi)\,d\phi \tag{I.8}$$

$$b_n = \frac{1}{\pi}\int_{-\pi}^{+\pi} f(\phi)\sin(n\phi)\,d\phi \tag{I.9}$$

Similarly the coefficients for (I.4) and (I.5) are related to the above coefficients by

$$c_n^2 = d_n^2 = a_n^2 + b_n^2$$


Instead of the simple trigonometric form used in equations (I.3) to (I.5) the cosine and sine functions can
be expanded into the exponential form where

$$\cos\phi = \frac{1}{2}\left(e^{i\phi} + e^{-i\phi}\right) \tag{I.10}$$

$$\sin\phi = \frac{1}{2i}\left(e^{i\phi} - e^{-i\phi}\right)$$

then equation (I.3) becomes

$$f(\phi) = \sum_{n=-\infty}^{\infty} g_n e^{in\phi} \tag{I.11}$$

where $n$ is any integer and, from the orthogonality, the Fourier coefficients are given by

$$g_n = \frac{1}{2\pi}\int_{-\pi}^{+\pi} f(\phi)\,e^{-in\phi}\,d\phi \tag{I.12}$$

These coefficients are related to the cosine plus sine series amplitudes by

$$g_n = \frac{1}{2}\left(a_n - i b_n\right) \quad (\text{when } n \text{ is positive})$$

$$g_n = \frac{1}{2}\left(a_{|n|} + i b_{|n|}\right) \quad (\text{when } n \text{ is negative})$$

These results show that the coefficients of the exponential series are in general complex, and that they
occur in conjugate pairs (that is, the imaginary part of a coefficient $g_n$ is equal but opposite in sign to that
for the coefficient $g_{-n}$). Although the introduction of complex coefficients may appear unusual, it should
be remembered that the real part of a pair of coefficients denotes the magnitude of the cosine wave of the
relevant frequency, and that the imaginary part denotes the magnitude of the sine wave. If a particular
pair of coefficients $g_n$ and $g_{-n}$ are real, then the component at the frequency $n\omega_0$ is simply a cosine; if $g_n$
and $g_{-n}$ are purely imaginary, the component is just a sine; and if, as is the general case, $g_n$ and $g_{-n}$ are
complex, both cosine and sine terms are present.

The use of the exponential form of the Fourier series gives rise to the notion of 'negative frequency'. Of
course, $f(t) = a_n\cos\omega_n t$ is a wave of a single frequency $\omega_n = n\omega_0$ radians/second, and may be represented
by a single line of height $a_n$ in a normal spectral diagram. However, using the exponential form of the Fourier
series results in both positive and negative $\omega$ components.

The coexistence of both negative and positive angular frequencies $\pm\omega$ can be understood by consideration
of the Argand diagram where the real component is plotted along the $x$-axis and the imaginary component
along the $y$-axis. The function $g_n e^{+i\omega t}$ represents a vector of length $g_n$ that rotates with an angular velocity $\omega$
in a positive direction, that is counterclockwise, whereas $g_n e^{-i\omega t}$ represents the vector rotating in a negative
direction, that is clockwise. Thus the sum of the two rotating vectors, according to equations (I.3), leads
to cancellation of the opposite components on the imaginary $y$ axis and addition of the two $g_n\cos\omega t$ real
components on the $x$ axis. Subtraction leads to cancellation of the real $x$ components and addition of the
imaginary $y$ axis components.
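The relation between the trigonometric and exponential coefficients can be illustrated numerically. The following Python sketch evaluates equations (I.8), (I.9), and (I.12) by the midpoint rule for a unit square wave, assuming the convention that the coefficient integral carries the factor $e^{-in\phi}$. The choice of waveform, the number of samples `N`, and the function names are assumptions made for this example. For this odd waveform the cosine amplitudes vanish and $b_n = 4/(n\pi)$ for odd $n$.

```python
import cmath
import math

N = 20000                      # midpoint samples over one period (assumed resolution)
DPHI = 2 * math.pi / N
PHIS = [-math.pi + (k + 0.5) * DPHI for k in range(N)]

def f(phi):
    """Unit square wave: +1 for 0 < phi < pi and -1 for -pi < phi < 0."""
    return 1.0 if phi > 0 else -1.0

def a(n):
    """Cosine amplitude a_n, equation (I.8)."""
    return sum(f(p) * math.cos(n * p) for p in PHIS) * DPHI / math.pi

def b(n):
    """Sine amplitude b_n, equation (I.9)."""
    return sum(f(p) * math.sin(n * p) for p in PHIS) * DPHI / math.pi

def g(n):
    """Complex exponential coefficient g_n, equation (I.12)."""
    return sum(f(p) * cmath.exp(-1j * n * p) for p in PHIS) * DPHI / (2 * math.pi)

for n in (1, 2, 3):
    print(n, a(n), b(n), g(n), g(-n))
```

The printed values show that $g_n$ and $g_{-n}$ form a conjugate pair and that $g_n = \frac{1}{2}(a_n - i b_n)$ for positive $n$; because this waveform is purely sinusoidal in content, the coefficients are purely imaginary.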


I.1.2 Aperiodic systems and the Fourier Transform

The Fourier transform (also called the Fourier integral) does for the non-repetitive signal waveform what
the Fourier series does for the repetitive signal. It was shown that the line spectrum of a recurrent periodic
pulse waveform is modified as the pulse duration decreases, assuming the period of the waveform (and hence
its fundamental component) remains unchanged. Suppose now that the duration of the pulses remains fixed
but the separation between them increases, giving rise to an increasing period. In the limit, only a single
rectangular pulse remains, its neighbors having moved away on either side towards $\pm\infty$. In this case, the
fundamental frequency $\omega_0$ tends towards zero and the harmonics become extremely closely spaced and of
vanishingly small amplitudes, that is, the system approximates a continuous spectrum.

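Mathematically, this situation may be expressed by modifications to the exponential form of the Fourier
series already derived. Let the phase factor $\phi = \omega_0 t$ in equation (I.12), then

$$g_n = \frac{\omega_0}{2\pi}\int_{-\tau/2}^{+\tau/2} f(t)\,e^{-in\omega_0 t}\,dt = \frac{1}{\tau}\int_{-\tau/2}^{+\tau/2} f(t)\,e^{-in\omega_0 t}\,dt \tag{I.13}$$

where $\tau$ is the period of the periodic force. Let $G(\omega) = \tau g_n$, $\omega = n\omega_0$, and take the limit for $\tau\to\infty$, then
equation (I.13) can be written as

$$G(\omega) = \int_{-\infty}^{+\infty} f(t)\,e^{-i\omega t}\,dt \tag{I.14}$$

Similarly, taking the same limit $\tau\to\infty$, then $\omega_0 = \frac{2\pi}{\tau}\to d\omega$ and equation (I.11) becomes

$$f(t) = \sum_{n=-\infty}^{\infty}\frac{G(\omega)}{\tau}\,e^{in\omega_0 t} = \sum_{n=-\infty}^{\infty}\frac{G(\omega)}{2\pi}\,e^{i\omega t}\,\omega_0 \to \frac{1}{2\pi}\int_{-\infty}^{+\infty} G(\omega)\,e^{i\omega t}\,d\omega \tag{I.15}$$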

Equation (I.15) shows how a non-repetitive time-domain waveform is related to its continuous spectrum.
These are known as Fourier integrals or Fourier transforms. They are of central importance for signal
processing. For convenience the transforms often are written in the operator formalism using the $\mathcal{F}$ symbol
in the form

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} G(\omega)\,e^{i\omega t}\,d\omega = \mathcal{F}^{-1}G(\omega) \tag{I.16}$$

$$G(\omega) = \int_{-\infty}^{+\infty} f(t)\,e^{-i\omega t}\,dt = \mathcal{F}f(t) \tag{I.17}$$

It is very important to grasp the significance of these two equations. Equation (I.17) tells us that the Fourier
transform of the waveform $f(t)$ is continuously distributed in the frequency range between $\omega = \pm\infty$, whereas
equation (I.16) shows how, in effect, the waveform may be synthesized from an infinite set of exponential functions
of the form $e^{\pm i\omega t}$, each weighted by the relevant value of $G(\omega)$. It is crucial to realize that this transformation
can go either way equally, that is, from $G(\omega)$ to $f(t)$ or vice versa.¹

¹ The only asymmetry in the Fourier transform relations comes from the $2\pi$ factor originating from the fact that by convention
physicists use the angular frequency $\omega = 2\pi\nu$ rather than the frequency $\nu$. In order to restore symmetry many papers use the
factor $\frac{1}{\sqrt{2\pi}}$ in both relations rather than using the $\frac{1}{2\pi}$ factor in equation I.16 and unity in equation I.17.
I.1 Example: Fourier transform of a single isolated square pulse:

Consider a single isolated square pulse of width $\tau$ that is described by the rectangular function $\Pi$ defined
as

$$\Pi(t) = \begin{cases} 1 & |t| < \frac{\tau}{2} \\ 0 & |t| > \frac{\tau}{2} \end{cases}$$

That is, assume that the amplitude of the pulse is unity between $-\frac{\tau}{2} \le t \le \frac{\tau}{2}$. Then the Fourier transform is

$$G(\omega) = \int_{-\tau/2}^{+\tau/2} 1\cdot e^{-i\omega t}\,dt = \tau\,\frac{\sin\frac{\omega\tau}{2}}{\frac{\omega\tau}{2}}$$

which is an unnormalized sinc function of $\frac{\omega\tau}{2}$. Note that the width of the pulse $\Delta t = \pm\frac{\tau}{2}$ leads to a frequency
envelope that has the first zeros at $\Delta\omega = \pm\frac{2\pi}{\tau}$. Thus the product of these widths $\Delta t\cdot\Delta\omega = \pm\pi$, which is
independent of the width of the pulse, that is $\Delta\omega = \frac{\pi}{\Delta t}$. This is an example of the uncertainty principle,
which is applicable to all forms of wave motion.
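The result for the square pulse can be verified by direct numerical integration of equation (I.17). In the sketch below the pulse width, the sampling density, and the function names are assumptions made for illustration only.

```python
import cmath
import math

def pulse_transform_numeric(omega, tau, n=20000):
    """Midpoint-rule estimate of the integral of exp(-i*omega*t) over -tau/2 <= t <= tau/2."""
    dt = tau / n
    total = sum(cmath.exp(-1j * omega * (-tau / 2 + (k + 0.5) * dt)) for k in range(n))
    return total * dt

def pulse_transform_analytic(omega, tau):
    """Closed form tau * sin(omega*tau/2) / (omega*tau/2), with the omega -> 0 limit tau."""
    x = omega * tau / 2
    return tau if x == 0 else tau * math.sin(x) / x

tau = 2.0
first_zero = 2 * math.pi / tau   # first zero of the frequency envelope
for omega in (0.0, 1.3, first_zero):
    print(omega, pulse_transform_numeric(omega, tau), pulse_transform_analytic(omega, tau))
```

The imaginary part of the numerical transform vanishes because the pulse is symmetric about $t = 0$, and the transform passes through zero at $\omega = 2\pi/\tau$, so that halving the pulse width doubles the width of the frequency envelope.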


I.2 Example: Fourier transform of the Dirac delta function:

The Dirac delta function, $\delta(t - t')$, is a pulse of extremely short duration and unit area at $t = t'$ and is
zero at all other times. That is,

$$\int_{-\infty}^{+\infty} \delta(t - t')\,dt = 1$$

The Dirac function, which is sometimes referred to as the impulse function, has many important applications
to physics and signal processing. For example, a shell shot from a gun is given a mechanical impulse
imparting  a certain  momentum  to  the  shell  in  a very  short  time.  Other  things  being  equal,  one  is  interested 
only  in  the  impulse  imparted  to  the  shell,  that  is,  the  time  integral  of  the  force  accelerating  the  shell  in  the 
gun,  rather  than  the  details  of  the  time  dependence  of  the  force.  Since  the  force  acts  for  a very  short  time 
the  Dirac  delta  function  can  be  employed  in  such  problems. 

As  described  in  section  3.11  and  appendix  J,  the  Dirac  delta  function  is  employed  in  signal  processing 
when  signals  are  sampled  for  short  time  intervals.  The  Fourier  transform  of  the  delta  function  is  needed  for 
discussion  of  sampling  of  signals 


$$G(\omega) = \int_{-\infty}^{+\infty} \delta(t - t')\,e^{-i\omega t}\,dt = e^{-i\omega t'}$$

Since $e^{-i\omega t}$ essentially is constant over the infinitesimal time duration of the $\delta(t - t')$ function, and the
time integral of the $\delta$ function is unity, the term $e^{-i\omega t'}$ has unit magnitude for any value of $\omega$ and has
a phase shift of $-\omega t'$ radians. For $t' = 0$ the phase shift is zero and thus the Fourier transform of a
Dirac $\delta(t)$ function is $G(\omega) = 1$. That is, this is a uniform white spectrum for all values of $\omega$.


I.2 Time-sampled waveform analysis

An alternative approach for analyzing periodic signals, that is complementary to the Fourier analysis harmonic
decomposition, is time-sampled (discrete-sample) waveform analysis where the signal amplitude is
measured repetitively at regular time intervals in a time-ordered sequence, that is, a sequence of samples of
the instantaneous delta-function amplitudes is recorded. Typically an amplitude-to-digital converter is used
to digitize the amplitude for each measured sample and the digital numbers are recorded; this process is
called digital signal processing.

The general principles are best explained by first considering the response of a linear system to a step
function impulse, followed by a square impulse, and leading to the response to a $\delta$-function impulsive driving
force.

Figure I.1: Response of an underdamped linear oscillator with $\omega = 10$ and $\Gamma = 2$ to the following impulsive
forces. (a) Step function force $F = 0$ for $t < 0$ and $F = m$ for $t > 0$. (b) Square-wave force where $F = m$ for
$0 < t < \tau$ with $\tau = 3$, and $F = 0$ at other times. (c) Delta-function impulse $P = 1$.

I.2.1 Delta-function impulse response

Consider the damped oscillator equation

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = \frac{F(t)}{m} \tag{I.18}$$

and assume that a step function is applied at time $t = 0$. That is;

$$\frac{F(t)}{m} = 0 \quad (t < 0), \qquad \frac{F(t)}{m} = a \quad (t > 0) \tag{I.19}$$

where $a$ is a constant. The initial conditions are that $x(0) = \dot{x}(0) = 0$.

The transient or complementary solution is the solution of the linearly-damped harmonic oscillator

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0 \tag{I.20}$$

This is independent of the driving force and the solution is given in the chapter 3.5 discussion of the
linearly-damped harmonic oscillator.

The particular, steady-state, solution is easy to obtain just by inspection since the force is a constant,
that is, the particular solution is

$$x_S = \frac{a}{\omega_0^2} \quad (t > 0), \qquad x_S = 0 \quad (t < 0)$$


Taking the sum of the transient and particular solutions, using the initial conditions, gives the final solution
to be

$$x(t) = \frac{a}{\omega_0^2}\left[1 - e^{-\frac{\Gamma}{2}t}\cos\omega_1 t - \frac{\Gamma e^{-\frac{\Gamma}{2}t}}{2\omega_1}\sin\omega_1 t\right] \tag{I.21}$$

where $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$. This functional form is shown in figure I.1a. Note that the amplitude of the
transient response equals $-\frac{a}{\omega_0^2}$ at $t = 0$ to cancel the particular solution when it jumps to $+\frac{a}{\omega_0^2}$. The oscillatory
behavior then is just that of the transient response.
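The step-function solution (I.21) can be checked against a direct numerical integration of the equation of motion (I.18). The Python sketch below uses a hand-written fourth-order Runge-Kutta step; the parameter values $\omega_0 = 10$, $\Gamma = 2$, $a = 1$ are assumed here to mirror figure I.1, and the function names and step size are illustrative choices, not part of the text.

```python
import math

OMEGA0, GAMMA, A = 10.0, 2.0, 1.0                 # assumed parameters
OMEGA1 = math.sqrt(OMEGA0**2 - (GAMMA / 2)**2)    # damped angular frequency

def step_response(t):
    """Closed-form solution, equation (I.21), for t >= 0."""
    decay = math.exp(-GAMMA * t / 2)
    return (A / OMEGA0**2) * (1 - decay * math.cos(OMEGA1 * t)
                              - GAMMA * decay * math.sin(OMEGA1 * t) / (2 * OMEGA1))

def accel(x, v):
    """Acceleration from equation (I.18) with F/m = a for t > 0."""
    return A - GAMMA * v - OMEGA0**2 * x

def integrate(t_end, dt=1e-3):
    """Fourth-order Runge-Kutta integration starting from x(0) = v(0) = 0."""
    x, v = 0.0, 0.0
    for _ in range(round(t_end / dt)):
        k1x, k1v = v, accel(x, v)
        k2x, k2v = v + 0.5 * dt * k1v, accel(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = v + 0.5 * dt * k2v, accel(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = v + dt * k3v, accel(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x

for t in (0.5, 1.0, 2.0):
    print(t, integrate(t), step_response(t))
```

At late times both values approach the steady-state displacement $a/\omega_0^2$, with the decaying oscillation of the transient superimposed.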

A square impulse can be generated by the superposition of two opposite-sign step functions separated by
a time $\tau$ as shown in figure I.1b.

The square impulse can be taken to the limit where the width $\tau$ is negligibly small relative to the response
times of the system. It can be shown that letting $\tau\to 0$, but keeping the magnitude of the total impulse
$P = a\tau$ finite for the impulse at time $t_0$, leads to the solution for the $\delta$-function impulse occurring at $t_0$

$$x(t) = \frac{P}{\omega_1}\,e^{-\frac{\Gamma}{2}(t - t_0)}\sin\omega_1(t - t_0) \qquad t > t_0 \tag{I.22}$$

This response to a delta function impulse is shown in figure I.1c for the case where $t_0 = 0$. An example is
the response when the hammer strikes a piano string at $t = 0$.
Figure  1.2:  Decomposition  of  the  function  x (t)  = 2 sin  (f)+sin  (5t)  + | sin  (15f)  + | sin(25f)  into  a time-ordered 
sequence of $\delta$-function samples.

I.2.2 Green's function waveform decomposition

The response of the linearly-damped linear oscillator to a delta function impulse, derived
above, can be used to exploit the powerful Green's technique for decomposition of any general forcing
function. That is, if the driven system is linear, then the principle of superposition is applicable, allowing
expression of the inhomogeneous part of the differential equation as the sum of individual delta functions.
That is;

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = \sum_{n=-\infty}^{\infty}\frac{F_n(t)}{m} = \sum_{n=-\infty}^{\infty} a_n(t) \tag{I.23}$$

As illustrated in figure I.2, discrete-time waveform analysis involves repeatedly sampling the instantaneous
amplitude in a regular and repetitive sequence of $\delta$-function impulses. Since the superposition principle
applies for this linear system, the waveform can be described by a sum of an ordered series of
delta-function impulses where $t'$ is the time of an impulse. Integrating over all the $\delta$-function responses that have
occurred at time $t'$, that is, prior to the time of interest $t$, leads to

$$x(t) = \int_{-\infty}^{t} \frac{F(t')}{m\omega_1}\,e^{-\frac{\Gamma}{2}(t - t')}\sin\omega_1(t - t')\,dt' \qquad t \ge t' \tag{I.24}$$

The Green's function $G(t - t')$ is defined by

$$G(t - t') = \frac{1}{m\omega_1}\,e^{-\frac{\Gamma}{2}(t - t')}\sin\omega_1(t - t') \quad (t \ge t'), \qquad G(t - t') = 0 \quad (t < t') \tag{I.25}$$

Superposition allows the summed response of the system to be written in an integral form

$$x(t) = \int_{-\infty}^{t} F(t')\,G(t - t')\,dt' \tag{I.26}$$

which gives the final time dependence of the forced system. This repetitive time-sampling approach avoids
the need of using Fourier analysis. Note that the Green's function $G(t - t')$ includes implicitly the frequency
of the free undamped linear oscillator $\omega_0$, the frequency of the free damped linear oscillator $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$, as well as the
damping coefficient $\Gamma$. Access to the combination of fast microcomputers coupled to fast digital sampling
techniques has made digital signal sampling the pre-eminent technique for signal recording of audio, video,
and detector signal processing.
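The superposition integral (I.26) lends itself to direct numerical evaluation. As a hedged illustration, the Python sketch below evaluates the integral by the midpoint rule for a step force and compares it with the closed-form step response (I.21). The parameter values, the unit mass, the assumption that the force vanishes before `t_start`, and all function names are choices made for this example only.

```python
import math

OMEGA0, GAMMA, MASS = 10.0, 2.0, 1.0              # assumed parameters
OMEGA1 = math.sqrt(OMEGA0**2 - (GAMMA / 2)**2)    # damped angular frequency

def green(s):
    """Green's function G(s), with s = t - t', from equation (I.25); zero for s < 0."""
    if s < 0:
        return 0.0
    return math.exp(-GAMMA * s / 2) * math.sin(OMEGA1 * s) / (MASS * OMEGA1)

def response(force, t, t_start=0.0, n=20000):
    """Midpoint-rule estimate of the integral of F(t') G(t - t') dt' from t_start to t.
    The force is assumed to vanish before t_start."""
    dt = (t - t_start) / n
    total = 0.0
    for k in range(n):
        tp = t_start + (k + 0.5) * dt
        total += force(tp) * green(t - tp)
    return total * dt

def step_force(t, a=1.0):
    """Step force F = m*a switched on at t = 0."""
    return MASS * a if t > 0 else 0.0

def step_response_exact(t, a=1.0):
    """Closed-form step response, equation (I.21)."""
    decay = math.exp(-GAMMA * t / 2)
    return (a / OMEGA0**2) * (1 - decay * math.cos(OMEGA1 * t)
                              - GAMMA * decay * math.sin(OMEGA1 * t) / (2 * OMEGA1))

for t in (0.3, 1.0, 2.0):
    print(t, response(step_force, t), step_response_exact(t))
```

Any other sampled force can be passed to `response` in place of `step_force`, which is the practical content of the Green's function method: the impulse response is computed once and then reused for every driving waveform.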


Bibliography 


[1]  SELECTION  OF  TEXTBOOKS  ON  CLASSICAL  MECHANICS 

[Ar78] V. I. Arnold, "Mathematical methods of Classical Mechanics", 2nd edition, Springer-Verlag (1978)

This  textbook  provides  an  elegant  and  advanced  exposition  of  classical  mechanics  expressed  in 
the  language  of  differential  topology. 

[Co50]  H.C.  Corben  and  P.  Stehle,  "Classical  Mechanics",  John  Wiley  (1950) 

This  classic  textbook  covers  the  material  at  the  same  level  and  comparable  scope  as  the  present 
textbook. 

[Fo05] G. R. Fowles, G. L. Cassiday, "Analytical Mechanics", Thomson Brookes/Cole, Belmont, (2005)
An  elementary  undergraduate  text  that  emphasizes  computer  simulations. 

[Go50] H. Goldstein, "Classical Mechanics", Addison-Wesley, Reading (1950)

This has remained the gold standard graduate textbook in classical mechanics since 1950. Goldstein’s
book is the best graduate-level reference to supplement the present textbook. The lack of
worked  examples  is  an  impediment  to  using  Goldstein  for  undergraduate  courses.  The  3rd  edition, 
published  by  Goldstein,  Poole,  and  Safko  (2002),  uses  the  symplectic  notation  that  makes  the 
book  less  friendly  to  undergraduates.  The  Cline  book  adopts  the  nomenclature  used  by  Goldstein 
to  provide  a consistent  presentation  of  the  material. 

[Gr06]  R.  D.  Gregory,  "Classical  Mechanics",  Cambridge  University  Press 

This outstanding, and original, introduction to analytical mechanics was written by a mathematician.
It is ideal for the undergraduate, but the breadth of the material covered is limited.

[Gr10] W. Greiner, "Classical Mechanics, Systems of Particles and Hamiltonian Dynamics", 2nd edition,
Springer (2010). This excellent modern graduate textbook is similar in scope and approach to
the present text. Greiner includes many interesting worked examples, as well as a reproduction
of the Struckmeier [Str08] presentation of the extended Lagrangian and Hamiltonian mechanics
formalism of Lanczos [La49].

[Jo98] J. V. José and E. J. Saletan, "Classical Dynamics, A Contemporary Approach", Cambridge
University Press (1998)

This modern advanced graduate-level textbook emphasizes configuration manifolds and tangent
bundles, which makes it unsuitable for use by most undergraduate students.

[Jo05] O. D. Johns, "Analytical Mechanics for Relativity and Quantum Mechanics", 2nd edition, Oxford
University Press (2005). Excellent modern graduate text that emphasizes the Lanczos [La49]
parametric approach to Special Relativity. The Johns and Cline textbooks were developed
independently but are similar in scope and approach. For consistency, the name "generalized energy",
which was introduced by Johns, has been adopted in the Cline textbook.




[Ki85] T.W.B. Kibble, F.H. Berkshire, "Classical Mechanics", 5th edition, Imperial College Press,
London (2004). Based on the textbook written by Kibble that was published in 1966 by McGraw-Hill.
The 4th and 5th editions were published jointly by Kibble and Berkshire. This excellent
and well-established textbook addresses the same undergraduate student audience as the present
textbook. This book covers the variational principles and applications with minimal discussion of
the philosophical implications of the variational approach.

[La49]  C.  Lanczos,  "The  Variational  Principles  of  Mechanics",  University  of  Toronto  Press,  Toronto, 
(1949) 

An outstanding graduate textbook that has been one of the founding pillars of the field since
1949.  It  gives  an  excellent  introduction  to  the  philosophical  aspects  of  the  variational  approach 
to  classical  mechanics,  and  introduces  the  extended  formulations  of  Lagrangian  and  Hamiltonian 
mechanics  that  are  applicable  to  relativistic  mechanics. 

[La60] L. D. Landau, E. M. Lifshitz, "Mechanics", Volume 1 of a Course in Theoretical Physics,
Pergamon Press (1960)

An  outstanding,  succinct,  description  of  analytical  mechanics  that  is  devoid  of  any  superfluous 
text.  This  Course  in  Theoretical  Physics  is  a masterpiece  of  scientific  writing  and  is  an  essential 
component of any physics library. The compactness and lack of examples make this textbook
less  suitable  for  most  undergraduate  students. 

[Li94] Yung-Kuo Lim, "Problems and Solutions on Mechanics" (1994)

This  compendium  of  408  solved  problems,  which  are  taken  from  graduate  qualifying  examinations 
in  physics  at  several  U.S.  universities,  provides  an  invaluable  resource  that  complements  this 
textbook  for  study  of  Lagrangian  and  Hamiltonian  mechanics. 

[Ma65] J. B. Marion, "Classical Dynamics of Particles and Systems", Academic Press, New York, (1965)

This  excellent  undergraduate  text  played  a major  role  in  introducing  analytical  mechanics  to 
the  undergraduate  curriculum.  It  has  an  outstanding  collection  of  challenging  problems.  The  5th 
edition  has  been  published  by  S.  T.  Thornton  and  J.  B.  Marion,  Thomson,  Belmont,  (2004). 

[Me70] L. Meirovitch, "Methods of Analytical Dynamics", McGraw-Hill, New York, (1970)

An  advanced  engineering  textbook  that  emphasizes  solving  practical  problems,  rather  than  the 
underlying  theory. 

[Mu08] H. J. W. Müller-Kirsten, "Classical Mechanics and Relativity", World Scientific, Singapore, (2008)

This  modern  graduate-level  textbook  emphasizes  relativistic  mechanics  making  it  an  excellent 
complement  to  the  present  textbook. 

[Pe82] I. Percival and D. Richards, "Introduction to Dynamics", Cambridge University Press, London,
(1982)

Provides a clear presentation of Lagrangian and Hamiltonian mechanics, including canonical
transformations, Hamilton-Jacobi theory, and action-angle variables.

[Sy60] J.L. Synge, "Principles of Classical Mechanics and Field Theory", Volume III/1 of "Handbuch
der Physik", Springer-Verlag, Berlin (1960).

A classic  graduate-level  presentation  of  analytical  mechanics. 

[Ta05] J. R. Taylor, "Classical Mechanics", University Science Books, Sausalito, (2006)

This  undergraduate  book  gives  a well-written  descriptive  introduction  to  analytical  mechanics. 
The  scope  of  the  book  is  limited  and  the  problems  are  easy. 




[2]  GENERAL  REFERENCES 

[Bak96] L. Baker, J.P. Gollub, "Chaotic Dynamics", 2nd edition, 1996 (Cambridge University Press)

[Bat31]  H.  Bateman,  Phys.  Rev.  38  (1931)  815 

[Bau31]  P.S.  Bauer,  Proc.  Natl.  Acad.  Sci.  17  (1931)  311 

[Bor25a] M. Born and P. Jordan, Zur Quantenmechanik, Zeitschrift für Physik, 34, (1925) 858-888.

[Bor25b] M. Born, W. Heisenberg, and P. Jordan, Zur Quantenmechanik II, Zeitschrift für Physik, 35,
(1925), 557-615.

[Boy08]  R.  W.  Boyd,  "Nonlinear  Optics",  3rd  edition,  2008  (Academic  Press,  NY) 

[Bri14] L. Brillouin, Ann. Physik 44 (1914)

[Bri60]  L.  Brillouin,  "Wave  Propagation  and  Group  Velocity",  1960  (Academic  Press,  New  York) 

[Cei10] J.L. Cieslinski, T. Nikiciuk, J. Phys. A: Math. Theor. 43 (2010) 175205

[Cio07]  Ciocci  and  Langerock,  "Regular  and  Chaotic  Dynamics",  12  (2007)  602 

[Cli71] D. Cline, Proc. Orsay Coll. on Intermediate Nuclei, Ed. Foucher, Perrin, Veneroni, 4 (1971).

[Cli72] D. Cline and C. Flaum, Proc. of the Int. Conf. on Nuclear Structure Studies Using Electron
Scattering, Sendai, Ed. Shoa, Ui, 61 (1972).

[Cli86]  D.  Cline,  Ann.  Rev.  Nucl.  Part.  Sci.  36,  (1986)  683. 

[Coh77] R.J. Cohen, Amer. J. of Phys. 45 (1977) 12

[Cra65] F.S. Crawford, "Berkeley Physics Course 3: Waves", 1970 (McGraw-Hill, New York)

[Cum07]  D.  Cumin,  C.P.  Unsworth,  Physica  D 226  (2007)  181 

[Dav58]  A.  S.  Davydov  and  G.  F.  Filippov.  Nuclear  Physics,  8 (1958)  237 

[Dek75]  H.  Dekker,  Z.  Physik,  B21  (1975)  295 

[Dep67]  A.  Deprit,  American  J.  of  Phys  35,  no. 5 424  (1967) 

[Dir30] P.A.M. Dirac, "Quantum Mechanics", Oxford University Press, (1930).

[Dou41]  D.  Douglas,  Trans.  Am.  Math.  Soc.  50  (1941)  71 

[Fey84] R.P. Feynman, R.B. Leighton, M. Sands, The Feynman Lectures, (Addison-Wesley, Reading,
MA, 1984) Vol. 2, P17.5

[Fro80]  C.  Frohlich,  Scientific  American,  242  (1980)  154 

[Har03]  James  B.  Hartle,  "Gravity:  An  Introduction  to  Einstein’s  General  Relativity"  (Addison  Wesley, 
2003) 

[Kur75] International Symposium on Math. Problems in Theoretical Physics, Lecture Notes in Physics,
Vol. 39, Springer, NY (1975)

[Mus08a] Z.E. Musielak, J. Phys. A: Math. Theor. 41 (2008) 055205

[Mus08b]  Z.E.  Musielak,  D.  Rouy,  L.D.  Swift,  Chaos,  Solitons,  Fractals  38  (2008)  894 

[Ray1887] J.W. Strutt, 3rd Baron Rayleigh, "The Theory of Sound", 1887 (Macmillan, London)

[Rou1860] E.J. Routh, "Treatise on the dynamics of a system of rigid bodies", Macmillan (1860)

[Sim98]  M.  Simon,  D.  Cline,  K.  Vetter,  et  al,  Unpublished 




[Sta05]  T.  Stachowiak  and  T.  Okada,  Chaos,  Solitons,  and  Fractals,  29  (2006)  417. 

[Str00] S.H. Strogatz, Physica D43 (2000) 1

[Str05] J. Struckmeier, J. Phys. A: Math. Gen. 38 (2005) 1257

[Str08]  J.  Struckmeier,  Int.  J.  of  Mod.  Phys.  E18  (2008)  79 

[Win67]  A.T.  Winfree,  J.  Theoretical  Biology  16  (1967)  15 


Index 


Abbreviated  action,  384 
Action 

abbreviated  action,  384 
Hamilton’s  principle,  382 
Action-angle  variables 

Hamilton-Jacobi  theory,  423 
Sommerfeld  atom,  485 
Adiabatic  invariance 

action  variables,  426 
plane  pendulum,  426 
Analytical  mechanics,  xviii 
Andoyer-Deprit variables
rigid-body  rotation,  315 
Archimedes 
history,  2 
Aristotle 

history,  1 
Asymmetric  rotor 

stability  of  torque-free  rotation,  322 
Asymmetric  top 

5 somersaults  plus  3 rotations  of  high  diver,  334 
separatrix,  322 
tennis  racket  motion,  323 
torque-free rotation, 321
Attractor 

van  der  Pol  oscillator,  94 
Autonomous  system,  93,  169 

Barycenter,  229 
Bernoulli 

history,  4 

principle  of  virtual  work,  138 
virtual  work,  111 
Bertrand’s theorem
orbit solution, 234
orbit stability, 241
Bicycle  stability 

rolling  wheel,  331 
Bifurcation 

non-linear  system,  103 
Billiard  ball,  15 
Bohr 

history,  7 

model of the atom, 485
history, 3
Lagrangian approach to quantum mechanics, 490
Poisson brackets in quantum physics, 398, 487
relativistic quantum theory, 490
Discrete lattice chain
Driven damped oscillator

absorptive  amplitude,  66,  84 

arbitrary  periodic  harmonic  force,  71 

elastic  amplitude,  66,  84 

energy  absorption,  65 

Green’s  method,  550 

harmonically driven, 62
analytic solution, 37
history, 4
Four vector
momentum energy, 468
scalar  product,  466 
Four-dimensional  space-time 
Riemannian geometry, 479
Fourier
Fourier transform, 547
Fourier transform

Dirac delta function, 548
linearly-damped linear oscillator, 70
Galilean invariance, 10
history, 2
General theory of relativity
principle of covariance, 478
Generalized energy theorem

Hamiltonian  mechanics,  187 
Generalized  force,  139,  144 
Generalized momentum, 180
Gilbert
Gravitation, 38

conservative,  39 
curl,  41 

determination of field from potential, 41
Newton’s laws, 44
Poisson’s  equation,  45 
potential,  40 
potential  energy,  39 
potential  theory,  41 
reference potential, 42
Green’s function method, 550
Group velocity
Hamilton

history,  5,  495 
variational  principle,  111 
Hamilton’s  equations  of  motion 
canonical  transformations,  407 
Hamilton’s principle

Hamilton-Jacobi equation, 384
Lagrange equations, 141, 200, 381, 532
least action, 382
Hamilton’s principle function
Hamilton-Jacobi theory, 413
Hamilton-Jacobi equation
Hamilton’s principle, 384
Hamilton-Jacobi theory, 412
Hamilton’s characteristic function, 414
Hamilton’s principle function, 413
Hamilton-Jacobi formulations, 414
Schrödinger equation, 490